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PREFACE 


The editors have pleasure in presenting this volume of our review series. 
We have specialised in three areas: perturbation Monte Carlo, non-linear 
kinetics and the transfer of radioactive fluids in rocks. These contributions 
are linked, however, in the demands for optimising complex systems that 
are a feature of the scale of nuclear power production. 


Kuniharu Kishida’s account of Japanese thinking in the application of 
modern non-linear theory to reactor kinetics and control comes at a time 
when the community of control scholars is seeking how to apply the new 
ideas that have led to the prominence of chaos theory to our field. Prob- 
lems of maintenance in power reactors are as severe as ever and must be 
solved for credibility to characterise any new program. As much as 30% of 
unanticipated down-time, for example, is due to the failure of motor oper- 
ated valves. We need a theory to provide for preventive maintenance. This 
in turn depends heavily on on-line monitoring to anticipate failure as well 
as expert systems to schedule preventive treatment. Noise theory with its 
promise of on-line interpretation of information from inchoate breakdown 
is the key. It is all too likely that the need to deal with major departures 
makes a non-linear theory of noise essential. We can be grateful that Pro- 
fessor Kishida has provided us with such a consistent account. 


The power of computers descending to the workstation and to the PC 
is now such as to make elaborate calculations of realistic systems available 
routinely to every engineer. One PC can be said to equal all the calculat- 
ing power available to the Manhattan Project and the early power reactor 
designers. In such modelling, stochastic simulation, or the Monte Carlo 
methods of nuclear history, plays a dominant role if fully dimensioned sys- 
tems are to be treated. But the engineer must do more than model; the 
professional needs to use these models not just to describe a system but to 
optimise it. Such a challenge calls for the repeated calculations of variations 
on a theme, with perturbation theory leading to optimisation methods a 
desiderata that imposes its burden on Monte Carlo theory. 
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Herbert Rieff, who has devoted his professional life at Ispra to the devel- 
opment of Monte Carlo methods, is supremely well positioned to review for 
us perturbation methods in Monte Carlo. Although these may have been 
developed principally for neutron and neutral particle transport, their appli- 
cability extends to charged particles and, in a different guise, the modelling 
of system reliability. These methods have a profound role to play in the im- 
provement of power station systems, and are therefore the common theme 
of this volume. 


Mike Williams in the third contribution has shown us how the ideas of 
neutron transport theory are applicable to the transport of fluids through 
fractured rocks and hence to the question of the possible spread of radioac- 
tivity from a waste repository. We may well take the view that until the 
public is satisfied as to the safety of every part of the nuclear power process, 
an essential ingredient in the complex pattern of nuclear power systems is 
absent. His work comes from a long involvement in theoretical transport 
theory complementing Monte Carlo studies, that springs not only from the 
insight of Case and Zweifel into singular eigenvalue theory but, even earlier, 
from seminal work by Boris Davison and Eugene Wigner. 


So let us not leave our review without paying homage to the first nu- 
clear engineer. Eugene Wigner died January 1, 1995. Wigner’s mastery of 
theory, in physics and mathematics alike, made him a dominant figure in 
nuclear theory, with a well-deserved Nobel Laureate in 1963. His application 
of group theory to the nucleus was underrecognised perhaps even in his lau- 
rels. The Breit-Wigner formula, Wigner-Seitz approximations, and Wigner 
energy are just some of the theories and approximate methods he generated 
in that era before digital processors. His scholarly account of neutron trans- 
port theory with Weinberg remains stimulating reading. 


Wigner’s role as a nuclear engineer rather than nuclear physicist, how- 
ever, is predicated on his leadership during the Manhattan Project during 
the Second World War, of the team that designed and saw to the construc- 
tion by the Dupont company of the Hanford plutonium production reactors. 
Wigner went ‘on loan’ from Princeton to the first chain reaction project un- 
der Fermi’s leadership at Stagg Field; Wigner had equal right to be recog- 
nised as mastering the theory of the nucleus as Fermi or indeed Szilard or 
the other European expatriates. But by taking the performance of this first 
fissioning device at a power of milliwatts to the 500 MW production reac- 
tors, in a design period of four months, Eugene Wigner must be recognised 
as the father of his profession, the first nuclear engineer. His preparation, 
from Lutheran gymnasium in Hungary to Chemical Engineer at the Berlin 
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Hochschule, was vital in providing that broad understanding of the many 
aspects of engineering that were essential to the success of this first and all 
subsequent nuclear engineering projects. We were proud to have him grace 
our Editorial Advisory Board. 


Ill health clouded the last years of Wigner’s life. We should, however, 
reflect on the training and discipline that made him a giant of the twentieth 
century, Father of Nuclear Engineering. 


It remains but to thank our authors and commend their work to our 


readers. 


Jeffery Lewins, 
Martin Becker 
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CONTRACTION OF INFORMATION AND ITS INVERSE PROBLEM 
IN REACTOR SYSTEM IDENTIFICATION 
AND STOCHASTIC DIAGNOSIS 


K. Kishida 


Department of Applied Mathematics 
Faculty of Engineering 
Gifu University 
1-1 Yanagido, Gifu 501-11, Japan 


I. INTRODUCTION 


Research concerning power reactor noise analysis 
makes rapid progress in the areas of system identifica- 
tion, prediction and diagnosis. Keywords in these studies 
are artificial intelligence, neural network, fuzzy, and 
chaos. Nonlinear, nonstationary, or non-Gaussian processes 
as well as linear and steady processes are also studied in 
fluctuation analysis. However, we have not enough time to 
study a fundamental theory, since we are urged to obtain 
results or applications in power reactor fluctuations. 
Furthermore, we have no systematic approach to handle 
observed time series data in the linear process, since 
power reactor noise phenomena are complicated. Hence, it 
is important to study it from the fundamental viewpoint. 
It is a main aim of the present review paper to describe a 
unified formalism for reactor system identification and 
stochastic diagnosis. 


Reactors are complicated systems in size, reactions, 
and degrees of freedom. In such a system, the approach of 
understanding in statistics or statistical physics is use- 
ful. An essential concept is the contraction of informa- 
tion. For example, Brownian motion is described by the 
projection onto the motion of a colloid particle only. The 
microscopic motion of molecules of the surrounding is 
eliminated by the projection, since the molecule positions 
are not observable. Due to the coarse graining in a spa- 
tial-time scale as in statistical physics [1], the pro- 
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jected motion is described by a master equation. In an 
ideal Brownian motion, the process is Gaussian and Marko- 
vian. However, a Markovian description is not always 
possible, and the contraction of information results ina 
non-Markovian property. The contraction is a projection of 
the object onto a cross section of our understanding. We 
call the cross section a "window of observation" in space 
and time scales. In fluctuation analysis the contraction 
of information is also useful in engineering plants. In 
the description of fluctuation phenomena in reactors, the 
manipulation of coarse-graining in a space-time domain is 
essential in modeling. The motion described by the master 
equation is also equivalently expressed by a Langevin 
equation. The stochastic equation is a bridge from the 
microscopic world to a macroscopic world, or a connection 
between a lower stage and a next upper stage in coarse- 
graining levels. Microscopic fluctuations are related to a 
macroscopic coefficient as in the fluctuation dissipation 
theorem or the Nyquist theorem [1]. Hence, the concept of 
the window of observation is essential in noise analysis. 





Reactor noise analysis has a history of more than 
three decades, and has two main results. One is the estab- 
lishment by 1975 of zero power reactor noise theory. The 
other is the application of a statistical method with 
multivariate time-series models to power reactor noises 
since 1976, especially the autoregressive model [2,3]. 
There are several methods or formalisms to describe noise 
phenomena occurring in reactors. Review books were pub- 
lished by Thie [4,5], Uhrig [6], and Williams [7]. 


The theory of zero power reactor noise is well estab- 
lished as mentioned by U. Farinelli in the first Special- 
ist Meeting On Reactor Noise (SMORN-1, 1974, Italy). The 
review paper by Saito [8] is the best summary paper of 
zero power reactor noise theory. Pacilio et al [9] also 
wrote of it in this series. In a suitable sampling time, 
noise phenomena of the zero power reactor can be described 
by the one-point model. Zero power reactor noise theory 
has been established on the assumptions of (a) Markov 
process, (b) linearity, (c) stationary process, and (d) 
ergodicity. The conditions (b), (c), and (d) are related 
to the stability of reactors. The reactivity of a reactor 
is considered to be subcritical in the case of a zero 
power reactor, since zero power critical reactors have 
increasing variance with time. After the success of reac- 
tor noise theory in a subcritical state, the reactivity is 
assumed to be always subcritical in the theoretical treat- 
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ment. In the scalar model description of reactor, we must 
consider nonlinear effects, when fluctuations are enhanced 
by a supercritical reactivity. On the other hand, at-power 
reactors have complicated reaction networks, especially 
feedback mechanisms. A reactor with the potential for 
supercriticality can be steady-operated with a negative 
feedback mechanism, since the total system of the reactor 
is stable. In the vector model description of power reac- 
tors, it is a hard task to evaluate stability. It is 
necessary to develop a theory of reactor diagnosis for 
power reactors with a positive reactivity and a negative 
feedback mechanism. 


The method of time series analysis has been used for 
identification of power reactors described by the repre- 
sentation of transfer function models. It is a fundamental 
task to examine transfer functions in a power reactor from 
the viewpoint of system identification, control and diag- 
nosis. For identification of transfer functions, the 
autoregressive (AR) model is used mainly in reactor noise 
analysis [10-18], which has the well established recursion 
formula for AR coefficients and the information criteria 
for determination of model order [2,19]. From the statis- 
tical standpoint and for development of reactor diagnosis, 
benchmark tests were held in the series of SMORN confer- 
ences [14,16,17]; the computational benchmark test in 
1981, the physical benchmark test in 1984, and the tests 
on AR modeling and anomaly detection in 1987 and 1990. 
Particularly in the benchmark test of the SMORN-V confer- 
ence, there was a question to determine open loop transfer 
functions in a stochastic feedback system from time series 
data. In the summary report of the benchmark test, it was 
concluded by Hoogenboon et al, [20] that no satisfactory 
agreement consisted in the results of the AR modeling of 
the proposed system. Hence one of aims of the present 
paper is to examine in principle whether benchmark test 
can be answered or not in the determination of open loop 
transfer functions. 


From the viewpoint of system identification and 
diagnosis, it is important to pay attention to innovation 
models, which lie in an intermediate stage between physi- 
cal models and time series models in reactor noise analy- 
sis, when we examine reactor noise phenomena by using a 
time series model. When a gap between a physical model 
describing a system and a time series model fitting time 
series data is small and also the number of output varia- 
bles is single, it is possible to describe reactor noise 
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phenomena without innovation models. This corresponds to 
noise theory of the zero power reactor, and then little 
attention is paid in zero power reactor noise analysis. 
However, the gap between a physical model and a time 
series model becomes large, if a system has complicated 
reaction networks and is described by multivariate 
input/output variables. In such a situation an innovation 
model is indispensable for reactor noise analysis. A 
suitable example of the innovation model can be found in 
the pole analysis of multivariate time series model. In 
the present paper properties of innovation models are 
examined from the viewpoint of system identification and 
diagnosis. 


Suppose a state space model be physically derived 
from a master equation in the Markov process by using the 
system size expansion method [21]. From the viewpoint of 
contraction of information, a theoretical innovation model 
is derived from the state space model under conditions 
where observable time series data are available in signal 
processing. Under the assumptions of controllability and 
observability a theoretical autoregressive moving average 
(ARMA) model is derived from the theoretical innovation 
model. The approach belongs to a forward or direct problem 
from a physical model to a time series model. 





In contrast, a numerical algorithm to construct an 
innovation model from time series data was developed 
recently. The opposite direction to the forward problem is 
an inverse problem. Let an innovation model, of which 
coefficients are determined from time series data, be 
called a data-oriented innovation model. It is important 
to understand properties of the data-oriented innovation 
model and examine its structure in more detail. Tt will 
be reported in the present paper that the data-oriented 
innovation model is indispensable for reactor system 
identification and stochastic diagnosis. 


In Section II, a unified theory will be presented, 
and properties of both a theoretical innovation model in 
the forward problem and a data-oriented innovation model 
in the inverse problem will be discussed. In Section III, a 
physical foundation of a discrete-time system model ob- 
tained by the system size expansion method and the non- 
Markovian effect due to a hidden variable will be report- 
ed. The formalism will be applied to the AR model analy- 
sis in Section IV, and provide two rules of AR poles 
and weights. In a feedback system, we can discuss system 
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identification from viewpoint of the formalism presented 
in Section V. 


II. STOCHASTIC MODELS AND THEIR PROPERTIES 


II.1. System Model and Theoretical Innovation Model 
(Forward Problem) 








Let a system be described by the state space repre- 
sentation of d-dimensional state vector x(n) with d-dimen- 
sional white Gaussian random force f(n), 


x(n)=®x(n-1)+f(n), (1) 
with E{f(n)}=0 and E{f(n)f(n')T}=Vô pn'> 
where E{---} denotes the ensemble average, T denotes the 


transposition of matrix, ô nn' is Kronecker's delta, ® 
and V are dxd constant matrices and n is discrete time. 
Hereafter, it is assumed that the mean E{x(n)}=0 without 
loss of generality, that eigenvalues of @ lie inside the 
unit circle in the complex plane, and that rank V=d. The 
case of rank V<d will be presented in a feedback system in 
Section V. The physical foundation of the model will be 
discussed in Section III. 1. For the Markovian description 
sufficiently large number of state variables must be 
observed physically. If, however, a few of them are meas- 
ured, let the observation be expressed by 


y(n)=Hx(n), (2) 


where xeR@, yERI, and perax d (q<d). The projection due to 
the observation Eq. (2) brings about the contraction of 
information in the system, and results in two equivalent 
innovation representations, Eqs. (3) and (4) and Eqs. (19) 
and (20): The first type of an innovation model [22,23] is 


x(n|n)=®x(n-1|n-1)+Ky7 (n) (3) 

y(n)=Hx(nIn), (4) 
where the conditional vector x(kln) ae Oo ead under 
the condition that Y(n):=[y(n)T, y(n-1)*, we) T is given, 


an innovation 7 (n):=y(n)~y(nin-1) with r (n)=E{7(n) 7 (n)7} 
=HP(n)H!, P(n) :=E{(x(n)-x(nin-1)) (x(n)-x(nin-1)7}, and a 


constant matrix K:=lim P(n)HTr (n)7}. Let lim P(n)=P and 
n= œ S 


lim F (n)= , then from Eqs. (1) and (3), we have a matrix 


n= œ 


Riccati equation of P, 
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P= {P-PH'r ~-lyp}oTev. (5) 


We call Eqs. (3) and (4) the first type of innovation 
model. Here it should be noted that the d-dimensional 
random vector f in the system model (1) is equivalently 
converted into the q-dimensional innovation y in the 
innovation model (3) , as it will be shown concretely in 
terms of conditional probability in Section III.2. The 
equivalence can be also recognized by comparing correla- 
tion function matrices. Measurable correlation function 
matrices of Eqs. (1) and (2) are 


Cyy(n) :=E{y(n)y(0)7} 
=HE{x(n)x(0)T}HT=HoC,,(0)H!, (6) 


where Cyy (0) is a solution of the following Einstein or 
Lyapunov equation 


z T 
Cyx (0) =OC,, (0) 0 +V. (7) 
For n<0, Cyy(n) =HC,, (0) (0T) IHT, The correlation function 
matrices can be rewritten in terms of Eqs. (3) and (4) as 
co 
ay n k Ti aTykyT 
Cyy(n) =H® Ae Er K*(®*)*H*, (8) 


since the Riccati equation (5) plays an important role in 
the equivalence [23]. 


Under the controllability and observability condi- 
tions, we obtain an ARMA model by eliminating x(n|n) from 
Eqs. (3) and (4), so that 


A(z~1)y(n)=B(z71) 7 (n), (9) 


where det A(z 1)=det(I-@z71), det B(z-l)=adet S(z71), «æ 
is a suitable constant, z* is the backward time shift 
operator: z~ly(n)=y(n-1), I is the unit matrix, and S(z71) 
is the system matrix defined by 


oz 1-1 K 
H o |. (10) 


The "physical" or "theoretical" ARMA model without statis- 
tical errors, Eq. (9), is different from numerical ARMA 
models in the time series analysis, since it can be de- 
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rived theoretically from a physical equation as in Section 
III. 1, and hence the coefficient matrices of (1) are de- 
scribed in terms of physical parameters. Since we can 
rewrite Eqs. (3) and (4) as 


x(nin) 0 
$(z71) = 
a b= | | al a 


then the transfer function matrix Gy(z71) of the innova- 
tion model is given by H(I-@z71)-1K for the system matrix 
(10). From the unimodular transformations, Gy(z71) is also 
obtained from Eq. (13) as 


Io |74}E(274) 
Gy(z-1)=-(0 I 
Rl oe } i 4 kani 


=A(z~1)-1p(z7}), (12) 





where A(z~!)~1lp(z271) is the left matrix-fraction descrip- 
tion of the transfer function matrix. We will show it 
briefly, since details were discussed by Yamada and Kishi- 
da [24]. AR and MA coefficient matrices of the ARMA model 
are derived from pre- and post-multiplying S(z71) by 
appropriate unimodular transformation matrices. Here AR 
and MA matrices of Eq. (9) will be related with S(z71) as 


Qı O} (Liz) o}fT Al s(z71) [S 0] [R(z) 0] [Q. 0 
o Lo lot OIj, oO Ijjo 1 


I 0 E(z71) 
=| 0 -A(z-1) B(z7+) (13) 
0 I 0 


A simple example of above matrix calculations will be 
written in Section III.2. Each unimodular matrices in Eq. 
(13) are expressed as following: 

G) T=(hy (OT)hy = (@7)%1-thy = hg * (07) 7g *hglt, 
where 9; is Kronecker's index! (24.90 ;=d), and hi is the 
i-th row vector of H. This matrix transforms (@z~1-1) 
into the Luenberger observable canonical form. 

(ii) S is the inverse matrix of T: S=T7l. Then, we have 





1 As in the literature of control theory [56,83], Kroneck- 


er's index is used for the description of minimal canoni- 
cal forms for multivariate input-output controllable and 
observable systems. The form in Eq. (14) is the observable 
canonical form, SSF4. 
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E | $(z71) | | -f o'z`l-I j 

0 I © I H' 0 |; 

where ®'=T®S, K'=T K and H'=H S. Here H' is written as 
H'=quasi diag{h;}, Gaie 2) aa 9Q@) 

with hy=(1 O ... 0), and the observability canonical form 


o 
®' has submatrices: 


®'={0j;}, (Gisele 25 cca a (14) 

where 

O 1 0 . 0 

0 0 10 0 0 0 
Oi4= . Z a ee ay Of5= ’ ; . g j 

ý s R. i o kei ae 

O 5% 2» 4 & € OE OD aigig, + + + ajj ib 

0 zeon Jen oj J 

Fit.oy a i oh) i 

7i 


with agg nyo? Eisi as) (ky=E 42304). Here, sj is 


the i-th column vector of S. 


1 
2 1 
(i111) R(z)=block diag {| , . . (o,=1, 2, en Q)}. 


27imtii.z1 


Then, the polynomial matrix (®'z71-1)R(z) has the same 
structure as ®"' with z7! in place of 1 and ayy (Z7) 
instead of aij,k respectively. Here iz le ) “are de- 
fined by 


-1 sct 
aij, 1(Z ) Z 0. as aaia ea O aij,1 1 
ay ¢z°=) 1 Z x . jja Z 
j.2(2_, . ij,2 3 
aij,3(Z ) = Z L 6. Rt è aij,3 - Z ô ije 
`z 1 2 to 


tij, gZ) zj? ... z 1 z! hay z7 j1 
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(iv) L(z) is a matrix with the submatrices: 


L(z)={L;;(2)}, (i, j=l, 2, s.. q) 
where 
7 O 
O 0 
Ly4(Z)= 
Z 
-zZ ELETA ECE. EATI Aan te) z9 4th 
O 0 
Lįj(z)= 
i 015,04 a aea a Ea, 0 
After some matrix calculations, A(zt) is given as 
A(z71)=[ 214,0, riyeg Jelly By ssp A) 
si Sy y-Bfiage Taipi 175°" 4. (15) 


The k-th row of B(z71) is equal to the (Eae i)-th ele- 
ment of L(z)K'. E(z *) is the matrix which has the remain- 
der of L(z)K'. 

(v) Q] and Q3 are suitable matrices for interchange of 
columns and rows. 


From the introduction of the system matrix represen- 
tation and Eq. (13), it is concluded that poles of the AR 
part and zeros of the MA part are obtained as roots of the 
following equations 





det A(z~1)=0 <= det (0z`l-I)=0, (16) 
det B(z7})=0 += det S(z7})=0, (17) 


where «<= denotes the equivalence. This is because 


I 0 
-1 -1_ =T 
M(z`>)(0z DN(z ~) k pa 


and 


oon 
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0 0 
I 0 
0 -B(z-+) J, 
where M(z71):=Q,L(z)T and N(z~1):=SR(z)Qo. Hence we can 
define theoretical ARMA zeros as roots of the following 


equation 
ARMA zeros: det B(z~1)=0 or det S(z7!)=0. (18) 


It is concluded that zeros of the innovation model 
are equal to ARMA zeros under the controllability and 
observability conditions: The ARMA model (9) has d-q 
finite zeros and q infinite zeros outside the unit circle 
in the complex plane of z` 


Although the first type of an innovation model of 
Eqs. (3) and (4) is equivalent to Eqs. (1) and (2), it 
should be noted that we can have another representation of 
the innovation model, as mentioned by Kishida [25,26], 
since the definition of innovation is y(n)=y(n|n-1)+7 (n) 


=Hx(n|n-1)+7(n). That is, we have the second type of an 
innovation model, 


x(n}n-1):=E{x(n) |Y(n-1)} 
=ME{x(n-1)/Y(n-1)} 
=ME{x(n-1)|Y¥(n-2) }+ME{x(n-1)17 (n-1)} 
=®x(n-1{n-2)+@PH ply (n-1) 
=@x(n-1(n-2)+®Ky (n-1) (19) 


y(n)=Hx(n|n-1)+y7(n). (20) 


The coupled equations have the form popular in the dis- 
crete Kalman filter of control theory, as mentioned in 
Appendix D. The equivalence between the coupled equations 
and Eqs. (1) and (2) is easily understood by the compari- 
son with their impulse responses. In Fig. 1, we summarize 
models in the formalism of a forward problem. 


Substituting Eq. (20) into Eq. (19), we have an 
expression of the innovation model 


x(nIn)=(1I-KH) ®x(n-1|n-1)+Ky(n), (21) 


since x(n|n-1)=®x(n-1|n-1). This gives a stability condi- 
tion of the innovation model (21): If the innovation model 
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Discrete System Model, Eq. (1) 
contraction of information 
by observation, Eq. (2) 


Correlation Functions of Observable Variables, Eq. (6) 
equivalence 


Equivalent Correlation Functions, Eq. (8) 


Theoretical Innovation Model, Eqs. (3) and (4) 
or Eqs. (19) and (20) 
observability and controllability 


Theoretical ARMA Model, Eq. (9) 
ARMA process 


Figure 1. Forward problem from physical equations to 
innovation and ARMA models via observable 
correlation functions. 


is stable, then eigenvalues of the matrix (I-KH)® are 
inside the unit circle in the complex plane. On the other 
hand, zeros of the second type of the innovation model are 
determined as roots of a characteristic equation of gt, 


š z7l-I oKz7} 
det Sg(z7>)= a : =0, (22) 


which is equal to detS(z~1)=0 as proved by Kishida [25]. 
From the determinant formula (A.4) in Appendix A, we have 


detSy(z~)=det{I-o (I-KH)z~+}=0. (23) 


From Eq. (23) it is concluded that the stable innovation 
is a minimum phase model, of which the zeros are invert- 
ible in the zi plane. Here the meaning of minimum phase 
is that all poles and zeros are inside the unit circle in 
the complex plane of z or outside the unit circle in the 
complex plane of zl, since eig{® (I-KH) }=eig{(I-KH)®}. 
Therefore, the zeros of Eqs. (19) and (20) are invertible 
from the stability of Eq. (21) and this property is relat- 
ed to the stability solution of the Riccati equation (5). 
As in Appendix C, stable eigenvalues of the generalized 
eigenvalue problem of a Riccati equation are related to 
ARMA zeros, which are the zeros of innovation models under 
the controllability and observability conditions. Hence, 
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it is concluded that the ARMA zeros are invertible, and 
they are reciprocal numbers of the eigenvalues of ®(I-KH) 
in the z` plane. 


Under the assumption of rank V=d in Eq. (1), we have 
developed our formalism. In different situations where 
there are several cases with rank V<d, our formalism can 
be easily modified by rewriting both, f(n) and V as both 
Gf(n) and GVG! as in Eq. (5.6) or Eq. (6) of Section V. 


II.2. Data-oriented Innovation Model via Singular Value 
Decomposition (Inverse Problem) 








To obtain an ARMA model numerically we must find an 
algorithm to determine three matrices ®, H and K of the 
innovation model, Eqs. (3) and (4) or Eqs. (19) and (20). 
Kishida [25-27] has already reported a method to determine 
matrices of an innovation model in the inverse problem via 
the singular value decomposition of Hankel matrix, of 
which the elements are observed correlation functions. 
Here we will review it briefly, though a similar theory 
was already reported by Desai [28] and Aoki [29]. 


From Eq. (5) we have a relation among P, Cyy (0) and a 
new matrix A, 


P=Cyy(0)-A, (24) 


Sak T;@T)k 
where Ai= aa KI K*(®*) is a stable solution of an 


algebraic equation, 
A=@AOlsoxrKlol. (25) 


If we define a new matrix, 
co a) 
Q:= SOKKKT(OT)KAT=KTKTHT+ ART, (26) 


then we can obtain three matrices corresponding to ®, H 
and Q from correlation functions. As in a similar algo- 
rithm in deterministic control theory (Ho and Kalman, 
[30]), we define a Hankel matrix H,,(k), and its associate 
matrices, Ha(k), Hp(k) and He(k), by arranging equivalent 
correlation functions, Cyy(k)=HOKQ, as 
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Catr Ctl) ac Gaal) 
sanar a 
Hyy(k):=| 7 i =0(k)L(k) 
Gyy lt) Cl «ow Byg (ke) 
Gey lh) Cypa) nun Cy hkl) 
Cyy (2) cy (3) a. Cyy(k+2) 
Hao) 7 i -0(k)OL(k) 
Cyy(K+1) Cyy(k+2) ... Cyy(2k+1) (27) 
Cyy (0) 
Cyy(1) 
Hp (kk) : = : =0(k)Q 
Cyy (k) 
He(k)1=(Cyy (0) Cyy(1) ... Cyy(k))=HL(k), 


where O(k)T=(H! THT = (@K)THT) and L(k)=(Q ©Q = oq). 


Output signals in a steady-operated plant are time 
series data with zero mean, Ef{y(n)}=0. Under the assump- 
tion of ergodicity, observed correlation functions can be 


calculated from N time series data, (y(n), n=l, 2,..., N): 
1 +N 
Ryy(n)= lim — ¥(y(k+n)y(k)7). (28) 
N>œ N ķk=1 


Using the singular value decomposition of the Hankel 
matrix Hyy, we can calculate three matrices A, B, and C 
from the Hankel and its associated matrices, Hyy» Ha» Hp: 
and He, of which the elements are observed correlation 
functions Ryy(n). That is, we have 


©: A=Ej)/2uTH, (kK) VE kl? 

» Biwxrrl/2yt 

Q: Bg=24)/2utHR(k) (29) 
H: C#Ho(k)Vy3,1/2, 


where Hy (kK) =U, E kVE, Zy, is a diagonal matrix of which the 
JragonaT elements consist of d non-negative singular 
values, Uk and Vk are orthogonal matrices, and k is a 
positive integer which satisfies a condition, q(k+1)>d. 
Furthermore, a following procedure is added to the numeri- 
cal algorithm in order to obtain K from ®, H and Q. 
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Since HK=HPH(HPHT) 1-1, we have from Eq. (26) 
K=(Q-AHT) rL. (30) 


To determine A in Eq. (30), we must solve a Riccati type 
equation of Eq. (25) 


A= A T+ (Q-AH™) (Cyy(0)-HAHT)“1(Q-AH")TOT, (31) 


since Cy (0)=r+HaAH!, After numerical calculations of A, 
B and y in Eq. (29), we can have the stable solution 
E for A from Eq. (31), EsAEAT+A(Bg-EC™) (Cyy(0)-CECT)~4 
+ (B ,-EC*)+*A*. Then, we can calculate B corresponding to K 
by substituting 4. Bg» C and Cyy (0) into Eq. (30): 
B=(B -ECT) (Cyy (0) -CEC yok, Finally we obtain a data-ori- 


ented innovation model for Eqs. (3) and (4), 
x(n|n)=Ax(n-1}n-1)+By (n) (32) 
y(n)=Cx(nIn), (33) 

In comparison with Eqs. (3) and (4) there is a transforma- 


tion matrix, T, in the representation of data-oriented 
innovation model; A=T oT, B=T-1k and C=HT, since U, and 
Vy are not unique in the decomposition of the Hankel 
matrix. Similarly, we have a data-oriented innovation 
model for Eqs. (19) and (20) 


x(n|n-1)=Ax(n-1[|n-2)+AB7 (n-1) (34) 
y(n)=Cx(n|n-1)+7 (n), (35) 


since CB=HTT~1kK=HK=HPH?r-t=rr-l=I. In Fig. 2, we can 
summarize models in the formalism of inverse problem. 


Attention should be paid to following two points as 
to an undetermined matrix T of the data-oriented innova- 
tion model in the inverse problem. One point is that data- 
oriented innovation models, Eqs. (32), (33), (34), and 
(35), have the same transfer function matrix as that of 
the theoretical innovation model, 


G7 (z71):=1+C(1-Az~1)~1aBz~1 (for Eqs. (34) and (35)) 
=C(I-Az~1)-1p (for Eqs. (32) and (33)) 
sH(1-0271)-1K:=6,(z74). (36) 


Since there is a unimodular matrix Utz), of which the 
determinant is constant: A(z~!)=U(z7!)A'(z7l) and B(z7h) 
=U(z~ YB (z71), we have a matrix fraction form of Eq. (36) 
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Time Series Data, {y(n)} 
stationary process 


Observed Correlation Functions, {R y(n)}, Eq. (28) 
singular value decomposition of Hankel matrix 


Data-oriented Innovation Model, Eqs. (32) and (33) 
system matrix or Eqs. (34) and (35) 


Data-oriented ARMA Model, Eq. (37) 


System Poles, ARMA zeros, and Transfer Functions 
of Innovation Model, Eqs. (36), (38), and (39) 


Figure 2. Inverse problem from time series data 
to innovation and ARMA models 


as in Eq. (12): 
G7 (271) :=a' (271) 7-1B (271) ea(z71)-1B( 274). 


Then, a data-oriented ARMA model is also given from data 
as 


A'(z71)y(n)=B' (271) 7 (n). (37) 


The other point is that from two data-oriented innovation 
models, Eqs. (32), (33), (34) and (35), finite ARMA zeros 
can be determined by d-q roots of the following polynomial 


in Z l, 


=detS(z71). 
0 (38) 


F Az~!-1 ABz`1 Az7l-I B 
detS, (z7 )=det =det 


” 


and, from Eq. (16) d ARMA poles in the zi 
obtained as inverse numbers of eigenvalues of A: 


plane are 


System poles: det(1-Az~!)=det(I-@z~!)=0. (39) 


From Eqs. (36), (38), and (39) conservative quanti- 
ties of a data oriented innovation model are system poles, 
ARMA zeros, and transfer functions of the innovation 
model. These values or variables are essential for reactor 
system identification and stochastic diagnosis as in 
Sections IV and V. 
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III. THE PHYSICAL FOUNDATION OF THE SYSTEM MODEL 





III.1 The System Size Expansion Method 





Fluctuations in reactor noise phenomena can be de- 
scribed by a master equation as in statistical physics. 
Noise phenomena in reactor plants are considered to be 
macroscopic. For such a macroscopic system a general 
theory using the system size expansion method was devel- 
oped by Van Kampen [31], Kubo et al [21], Tomita et al 
[32] and Kishida et al [33]. Let X={X;,} (151, 2, ..., d) 
be a set of d extensive macrovariables, and a stochastic 
process be described by a master equation 


2ean [wasr aoras ona awra oaaw ; 
(1) 


where P(X,t) is the probability distribution function of X 
at time t, and W(X,AX) is the transition probability per 
unit time from X to X+AX. Taking the Kramers-Moyal expan- 
sion of Eq. (1), we have 


—P(X,t)= E—(- —) Ke, (x) P(X, t), (2) 
at - k=1 k! ðX 


where Cy (X)= f (ax) Kwex, ax)aax is the k-th moment of the 
transition probability. Since elementary processes in 
macroscopic systems are usually localized, the transition 
probability is written as 


W(X, AX)=Qw(x, AX), (3) 
where Q is the system size, e:=Q7!, and x:=eX is a 
normalized d-dimensional vector. Since fluctuations of 
macroscopic variables are assumed to follow the central 
limit theorem, then x is decomposed into a mean vector, 
y(t), and its fluctuation, F 
xsy(t)+el/2z , (4) 


By using Eq. (4), Cy (x)=e Cy (X) and p(x,t)=e ~9p(x,t), Eq. 
(2) is written as 


ð > ð 
Zp ,t)-e ~1/29( +) —p(¢,t) 
at ð E 
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© e (n-2)/2 ð k 1/2 
SS ete ite th. 


Comparing the term in e~1/2 on both sides, we obtain a 
time evolution equation of y(t) 


d 
apy (tse (y(t). (5) 


From the next order of e we have a linear Fokker-Planck 
equation for §& 





ð 1 
PE. tye y LIEP, as 


JE Acz) tht) 


(6) 


It is easy to show that Eq. (6) is satisfied by the Gaus- 
sian distribution 


P(E ,t)=((22)4deto (t))~1/2exp(-ETe (t)71£), (7) 
where ø (t) is the variance matrix defined by 
ewro- s eTP(E ,t)de. (8) 


In the steady state, Eq. (6) is stochastically equivalent 
to a Langevin equation 





dE db 
d&=LEdt+db or gre ik with nepaci (9) 


ðC (Yg) 

a¥s ' 
D:=colyg), ô(t) is the Dirac delta and a steady mean 
vector Yg abisi T ES c1 (Ys)= =0. It should be noted that from 
Eq. (6) S steady variance matrix og, satisfies the Lyapo- 
nov equation. Lø gt 7 sL +D=0. 


where L:= E{r(t)}=0, E{r(t)r(t')T}=p6 (t-t'), 


By integrating Eq. (9) from (n-1)At to nAt, a Lange- 
vin equation can be written in the discrete-time represen- 
tation of Kishida and Sasakawa [34] as 


x(n)=®x(n-1)+f(n), (10) 


where x(n) is a d-dimensional state vector at discrete 
time n with E{x(n)}=0, f(n) is a d-dimensional Gaussian 
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white random vector with E{f(n)}=0, and E{f(n)f(n')7} 
“VOnn'> Here 6 yy, is the Kronecker delta, and time 
coarse-grained vectors and matrices are 


nAt At 

ea f exp[L(nAt-s) ]r(s)ds, vf exp(Ls)Dexp(L's)ds, 
(n-1)At 0 

x(n)=£ (nåt), and ®=exp(LAt). 


The physical model (10) corresponds to Eq. (2.1), which is 
Eq. (1) in Section II. 


TII.2 Non-Markov Effect and Hidden Variable 





Since the theory of zero power reactor noise is well 
established, noise phenomena in a zero power reactor are 
illustrated to understand the formalism mentioned in 
Section III. 1. The aim of this section is not to introduce 
the noise theory of a zero power reactor, but to show how 
a hidden variable affects the output time series data, by 
setting the precursor as the hidden variable. 


The two state variables of the theory are the total 
numbers of neutron and precursor, since the thermal, 
mechanical, and hydraulic effects are negligible in a zero 
power reactor. There are mainly two kinds of approaches, 
such as the Kolmogorov and the Langevin formalisms. By 
means of the first collision probability method, Pål [35] 
and Bell [36] examined the branching process in the frame- 
work of probability distribution of backward Kolmogorov 
formalism. The forward Kolmogorov formalism was presented 
by Courant and Wallance [37] to understand neutron popula- 
tion dynamics. On the other hand, the Langevin method was 
suggested by Cohn [38] in reactor noise analysis, in which 
the Langevin equation is one of the fundamental equations 
used to describe an irreversible process, and one of 
important tools to combine a microscopic world with a 
macroscopic world. Although the branching process is not 
Gaussian mathematically, a Gaussian distribution was 
observed experimentally in zero power reactors. Hence, 
there is an ambiguity in both formulations. This ambiguity 
can be clarified [33] by using the framework of system 
size expansion, since a solution of the Fokker-Planck 
equation (6) is the Gaussian distribution (7) asymptoti- 
cally [7,32]. 


Let the zero power reactor be described by a point or 
lumped model with one group of delayed neutron precursors 


CONTRACTION OF INFORMATION AND ITS INVERSE PROBLEM 19 


under the assumption of Markov and stationary processes. 
Let P(N,C,t) be the probability distribution of N and C at 
time t, where N is the total number of neutrons and C is 
that of precursors. The time evolution of the probability 
distribution is described by a master equation with the 
transition probability. 


WIN;G AN ACIES 99,95 AG 0* ACE Ante AO, -1 


Z Sa ) ( 
+ È = — plos» 6 -1 » (11) 
vy g=0 v4=0 : 0 1 AN, vo 1 AC, vy 
where S is neutron source, A is the precursor decay 


constant, P(vg 44) is the probability of vg neutrons and 
vı precursors emitted in an absorption, ¢ is the neutron 
lifetime. Suppose that N=Qn, C=Qc, and S=Qs, then we can 
use the system size expansion method mentioned in the 
previous section. In this way, Eq. (9) becomes a 2-dimen- 
sional Langevin equation, (d/dt)& (t)=L& (t)+r(t), with 


1-k(1- 
ee (p-B)/A A 
Le t or (12) 
Bk/v ~à B/A -ì 


where k is the effective multiplication factor, p is the 
reactivity, and A is the generating time. 


To realize the essence of the contraction in noise 
sources, let us suppose that the observable variable is 
the neutron number and the precursor number is a hidden 
variable. Namely, the observation matrix is H=(1 0). In 
the discrete time representation we have 


y(n)=Hx(n)=(1 0)x(n). (13) 
The time evolution equation of observable variable y in 


Eqs. (10) and (13) is expressed by eliminating x(n) as a 
non-Markovian type of model, 





y(n)-tr®y(n-1)+det®y(n-2)=f](n)-#392f1(n-1)+ġ19f3(n-1) ’ 
(14) 


where ij is the (i.j) element of @:=exp(LAt) and the 2- 
dimensional noise vector f(n) is written as (f(n) 
fa(n))T. Since f(n) is not measurable, it is necessary to 
rewrite the noise term in measurable quantities. An inno- 
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vation is used for the equivalent representation of noise 
forces in the Gaussian process: 7 (n)=y(n)-E{y(n)/Y(n-1)}. 


Hence, the derivation from Eq. (14) to an ARMA model can 
be expressed by the projection of variables to past ob- 
servable time series data, Y(n-1): Taking the conditional 
average of Eq. (14), we have 
=P gvir triz 
y(n)-tr@®y(n-1)+detOy(n-2)=7 (n)+—————————— 7 (n-1), 
T (n-1) 
(15) 

where E{y(n)IY(n-1)}=y(n)-y(n) from the definition of the 
innovation, E{y (n-i) |Y(n-1)}=y(n-i) (i=1,2), and 


r (n-1):= f (y(n-1)-y(n-1in-2))?P(y(n-1) 1¥(n-2) )dy(n-1) is 
the variance of y(n-1). The first term on the rignt-hand 
side (the random noise terms) becomes zero: 
E{f,(n)1Y¥(n-1)}=0, since we can use the randomness of 
f,(m). The second term, f,(n-1), is projected as 


f,(n-1/n-1):=E{f(n-1) |Y(n-1)} 
=E{f (n-1)¥(n-1)"}[E{¥(n-1)¥(n-1)7}]~+¥(n-1) 
E{y(n-1)y(n-1)} meriad nd 

=(Vv11 O = 0) 

E{Y(n-2)y(n-1)} E{Y(n-2)Y(n-2)T} 

=v411 (n-1) "+L y(n-1) 
-E{y(n-1)¥(n-2)T}E{¥(n-2)¥(n-2)7}~1y(n-2) J 

=V11T (n-1) 71 y(n-1)-E{y(n-1) 1¥(n-2)}] 

=v,1P (n-1) 747 (n-1), (16) 


Y(n-2) 


where we used the conditional Gaussian property in (B.2) 
with white random forces, E{f,(n)¥(n)"}=(vyq 0 - 0), and 
the inversion block matrix formula (A.2). The third term, 
fo(n-1), is also expressed equivalently by the innovation 
y (n-1). The detailed derivation is reported by Kishida 
and Sasakawa [34]. 


Squaring and taking the ensemble average after 
equating both right hand sides of Eqs. (14) and (15), we 
obtain a time evolution equation of T (n) 


Ac 
T (n) +— Bos (17) 
T (n-1) 


with A= (-¢ 99V11*% 1212)? and Bo=(1+¢ $0)¥41*¢ foVo0 
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-2¢49¢902V49- The Riccati equation (17) becomes r2 
-BoT +A,=0 in the steady state. We have two steady solu- 
tions, T, and [_. The steady Riccati equation is also 
derived directly from Eq. (2.5). From Appendix C, a stable 
steady solution is given as 


r ,=(/ B2-4A+B,)/2, (18) 


which is one of solutions of r2-Bor +A,20. As to an un- 
stable steady solution [_ obtained reversely in the time 





evolution of Eq. (17), we have the relation 
F _=A,/T ,. (19) 
From Eq. (15) and the relation Eqs. (18) and (19), we 





have a backward time evolution equation for the observable 
variable y: 


~$99V11*%12%12 
y(n)-tr®y(n+1)+det@y(n+2)=7 (n) +—————_ 7 (n+1), 
p 


(20) 
and for a forward time evolution we obtain 


-$ 99V11*%12%12 
y(n)-trOy(n-1)+det®y(n-2)=7 (n)+———————-_ 7 (n-1). 
r 


+ 


(21) 


Since there remains a difficulty in generalization of the 
above procedure to a higher-dimensional case, we have 
examined the systematic derivation of the ARMA model, Eq. 
(2.9), via the innovation representation, Eqs. (2.3) and 
(2.4). 


Since we can take the algebraic approach as in Sec- 
tion III.1, it is easy to show that the ARMA model (21) 
will be also derived via the system matrix (2.10), 


$442-1-1 $4927} 1 
S(z-1)=| $9,271 $p9271-1 p12/P11 (22) 
1 0 0 i 
where P =HPHT=p]}, K=PHTT=(1 p]2/P11)", and pyj is the i,j 
element of P. Since H=(1. 0), the transformation matrices 
mentioned in Section II.1 are follows: 
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hT 
hf=(1 0), h]o=(%11 %12)» “| ‘te | and S=T~!, 


Then we have H'=HS=(1 0), K'=TK=(1 $1;+¢49P,0Pi+)!. 


0 1 1 0 Z 0 
®'= » R(z)= . and L(z)= = -1 
-det® tro Bl -(tr®)z°"4+1 277 j 


In this way we obtain 


L(z) 0 T 0 S 0 R(z) 0 0 1 Zz 
| lÍ jea l | |- 0 tra) 
o zrilo I 0 I 0 I 1 0 o0 ; 
(23) 


where a(z71)= =1-(tr®)z-l+(det®)z~2 and b(z71)=1+(¢19Py19 
-Ø 99P11)Piq2~ l. after the interchange of columns, we 


obtain the system matrix mentioned in Eq. (2.13). 
For the 2 dimensional case, we can rewrite Eq. (2.5) 
as 
P=qV+V, (24) 





2 2 
$ $ 
where 4q=Pg99- P12 » and v-| 12 #12%22 


A From Eq. 
2 
$12%22 $32 | 


(24) , we have $ 12P127 $ 22P117 $ 12¥127 $ 22¥11' and then an 
ARMA model via Eq. (23) is the same as Eq. (21). The 
details have been reported in papers [34,39]. Hence, the 





formalism developed in Section II can be applied to the 
analysis of power reactors with complicated reactions from 
the viewpoint of contraction of information. 


IV. APPLICATION TO AR MODEL ANALYSIS 


When a system is characterized by an AR (autoregres- 
Sive) process, the method of AR model analysis is a power- 
ful tool, since there are useful recursion algorithms for 
efficient computations [40,41] and popular information 
criterions such as FPE, AIC, and MDL [42-44]. For exam- 
ple, the FORTRAN program TIMSAC [2] is available. Hence, 
we can analyze stochastic properties of the system by 
using the noise power contribution ratio in addition to 
the conventional quantities such as partial coherence and 
frequency response functions. Even when a system is not an 
AR process, a power spectral density (PSD) is estimated 
successfully in the AR model identification, since the AR 
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model is a kind of power series. The AR model analysis has 
been applied in practical cases [10-18]. 


In AR modeling, we observe a system through a filter 
or "AR lens" in general. We should investigate results 
obtained from the AR model analysis. Otherwise, we make a 
mistake in identification or diagnosis by the distortion 
through the AR lens. Though some original properties are 
retained by the approximation of an AR model, some other 
properties can not be restored to the original, as long as 
it is an approximation. In AR modeling, PSD is one of the 
former properties, and agrees with that of the FFT or 
Blackman-Tukey method. Though the method of AR modeling 
has been examined in many papers, there are few papers 
which state the latter properties. Hence, it is important 
to recognize what is lost in the AR model analysis when we 
want to apply the AR model to reactor diagnosis. 





Along with the direction mentioned above we will 
examine properties of AR models in system identification. 
First of all, let us assume that reactor noise phenomena 
are described by Eqs. (2.1) and (2.2) in the discrete time 
representation. Owing to the contraction of information of 
the observation, observable correlation functions are 
related statistically to the ARMA model (2.9) as in Sec- 
tion ‘ 














A(z~1)y(n)=B(z71)7 (n) or y(n)=A(z71)~1B(271) 7 (n). (1) 


It should be noted that the ARMA model is determined as an 
equivalent representation corresponding to the observation 
H. If we have the other observation in the same system, 
then an ARMA model for the observation is different from 
Eq. (1). Let us examine properties of an AR model ina 
process described by the ARMA model (1). That is, let an 
AR model of order m be defined by 


w(m,z71)y(n)=e,,(n), (2) 
where W(m,z-1)=1+6 427 1+... +ø pz ™, Em(n) are residuals 


and {¢,} are coefficient matrices of the AR model. AR 
poles are roots of the characteristic equation 


AR poles: det¥ (m,z~1)=0. (3) 
In the AR model, zeros of the ARMA model (1) must be 


contracted and expressed by a group of AR poles, since the 
AR model fitting the ARMA process is all pole-type. From 
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the relation 


k-1 L 
Sai E T 


Heej) 


where w ;=exp(i2z j/k), the operation of zero-elimination 
in the dr modeling can be evaluated approximately by that 
of circular pole-creation of AR-type model with large 
model order. This may be interpreted symbolically as {one 
real zero of MA part}={a group of circular poles of AR 
part} [45]. The elimination of ARMA zeros is also the 
contraction of information. In the contraction, we have 
two rules for an asymptotic AR-type model. In Section 
Iv. 1, one is "the pole location rule of AR-type model" as 
in Fig. 3. In Section IV.2 the other is "the separation 
rule of AR-type poles" as in Fig. 4. Since there are few 
papers that examine these properties of AR model, then we 
will introduce the above rules in the case where a single 
variable is observable, q=1, since the general theory of 
the multivariate case has been reported in a series of 
papers [46-52]. A theory of AR-type model with q>1 is 
omitted. 


IV.1. The Pole Location Rule of AR-Type Model 





Let a scalar ARMA (d,d-1) process (1) be described by 
-1 K -1 
alz ` >)y(n)=b(z27>)y (n), (4) 
where d 
a(z"}):=det (I-27 )=142 ayz™? 
I-oz! K 


d-1 -1 
=1+ Ð byz à 
-H 0 i=1 


ba) sor] 


and E{7 (n)7 (n')}=P 6 yy! with Pr=g2. without loss of 
generality poles and zeros of the ARMA model (4) are 
assumed to satisfy relations 


lajtisiasiis. . .stag-jiisiagti 

LAG lela les « «Zl taa stid- i (5) 
=1 =1 “1 -i -1 

| Islas is. . .sla I<laqri<la l, 

ay lslag is al Pig se 1 tjt! 


d .d-1 
where alzi) N (1-a y27>)- and b(z7})= II (1-142771). Here- 


after, we call a closed curve izZisiazti the conver- 
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gence circle. Then, there ar fin System poles inside the 
convergence circle, since system poles, which are inverse 
of eigenvalues of ® in the zl plane, correspond to ARMA 
poles from Eq. (2.16). 





Consider a scalar AR model of degree m defined by 


g(m,z~t)y(n)=e,(n), (6) 
where ¢$(m,z71)=1+ 3 am (i)272 and Ef{e,(n)ée,_(n')} 
=0 26 nn’ . Poles of the AR model are roots of ¢(m,z~+)=0 


in the z 1-plane. Let the number of time-series data be 
assumed to be so large that the statistical error of the 
correlation function is negligible. However, there still 
remains a bias between the ARMA model (4) and a fitted AR 
model (6). From the viewpoint of system identification or 
diagnosis, it is important to know dynamic properties of 
the fitted AR model, of which the coefficients are deter- 
mined from the normal equation, owing to the minimum 
prediction error. Coefficients of the AR model are related 
recursively by nonlinear transformations [40,41], and we 
cannot know the properties of AR poles analytically. 
Therefore we will introduce an analytical model like the 
AR model. 


From the minimum phase property of the innovation 
model as mentioned in Section II.1, zeros of the ARMA 
model (4), Aa (14€1; L:=1, 2, , d-=1) are invertible; 


min 1a ļti>1 in the z`1 plane (cf. b(Aj+)=0). The rational 


function a(z~1)/b(271) can be expanded into a Taylor 
series in the region of |z~+\<minl aj. 
iI 


S -1 
(1+ Ł cįz`™>)y(n)=7 (n). 

i=1 
If we truncate the series at order m, then an AR-type 
process, 

ta(zt)y(n)=v p(n), (7) 
has a colored noise Y mín). where 
m 1 a(z71) 


tm(z-+)=1+ >» eyzt and ¢y=—(——) (J) 
i=1 j! b(z7) z~1=0. 
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For the asymptotic behavior, we have limE{v mín)» m(n "J3 
m- œ 


=O 25 nn‘: If » mín) in Eq. (7) is replaced by a white 
noise y(n), we have a new process called the TAR (trun- 
cated AR-type) model: 


tm(z t)y(n)=7 (n). (8) 





Poles of the TAR model are roots of the equation in the 
z 1-plane: 


TAR poles: t_(z~+)=0. (9) 


From (8), the poles of the TAR model mainly depend on 
b(z7t). To examine pole properties of TAR model (8) ana- 
lytically, it is necessary to discuss the following Lemma. 


Lemma 1: Let g(z7t) and h(z71) be polynomials of 
degrees k and ¢. Let gj! (i€J; J:=1, 2, ..., 4) be zeros 
of h(z~1)s0. A rational function of zl, 
n(z7+):2¢(z7!)/n(z71), is expanded into the Taylor series 
in the region of Iz“ i<minl Gt: n (2-1)= E 6,273, Then, 
a polynomial defined by the first (m+1) terms of n(z7t) 
is transformed as 


-1 m -i 
ge oE 3: bas 
m 120 i 





g(z71) 1 «¿ ugeceqty (ey) tl 
ae -1)m+1 
2 Eja <) 
h(z7Í) detW i=1  1-#;įz`1 


where uy=(adjw);, and W is the Vandermonde matrix; 











1 1 ae Sexxy kE 
Fy Fo ah Gata f, 

W= E2 2... g? 
1 2 t 
ac ae pj 
Fy Fo a sf; 

k . 
Proof: If we put g(zt)= F ua“ and h(z7}) 
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t 
=1+2 h,z7} then we have a recursion relation 
i=l 
l 
j=1 


Using notations of matrix and vectors 


oO], pea o 


= BE g 0 :=| 6 and @ =|: 
l o O 1 i |” i-:+2 i Jo 
: ei , 
-h, -h 6 # » ell 6; 
l e-1 1 1 for isk 
Eq. (10) is rewritten as 
8 5=Q0;_1+0 i (i=1, Zs eee, k) 
i=7Q0 i -1 (isk+1, k+2, ...,m). (11) 


Suppose that Fitt; for i#j, then there is the Vandermonde 
matrix W such that a=-wawl where A=diag(&4, €9,..+.€ t Ja 


Substituting these relations into Eq. (11), we have 
i 
pagar swe 5 (i906; dy acce HS 
6,=1°k (12) 
i peat wie j (i=k+1, k+2, ..., m). 


Tf we put uy=(adjW);, , Eq. (12) is expressed in terms of 
components as 








ak i t 
64= Senger Gt) ke sae B 
detW j=0 n=1 
o R 4 (13) 
TE E Sem, F hi) (isk+1, k+2, ..., m). 


detW j=0 n=1 


From Eq. (13) we have 





1 k l n 
0 (z-+)= {> = 2 pgr rey 
detW j=0 n=1 i=j 
1 k t m- 
= {I 3 giuté tents (eye 





detW j=0 n=1 i=0 

1 £ Unty -1 
z ————ieg(z!)-(pzD*lg( g1} 
detW n=1 1-ë pz“! 
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1 
=——— {q(z7!)g(271)-r, (274) (272) My, (14) 
h(z7) 


In the last equality we use a(z7+) and rata +) defined by 
hizt) « ugg t 


z 
detW n=1 1-¥pz“! 





atz tgi- 


h(z71). u,F i - 
ralz te= z ——— High). 
detW n=1 1-Ẹ pz“! 





In the region of j2-li<mini jit, we obtain 
i€J 
g(z7}) 
lim nm(z-t)=———— and Lim rp_(z71)(z71)™*1e0, 
m— æ hiz =) m= œ 


From above relations and the definition of ņ, we have 
lim npm(z7}h(z })=n (27 )n(z74) 
m œ 
a(z~1)g(z71)=e(z74). 


Therefore we have q(z7+)=1. Q.E.D. 


From Lemma 1, we can rewrite the TAR model (8) to 








a(z71) 1 d-1 u,a(a7tjad2 
ty(z7t)= - 2z ——— (2,771 ™ 1 (15) 
b(z7])  detS:=1 1-2 ,27} 
where 
i @ ceg i 
Ay Ao . . . à d-1 
s=] AZ Ag cow Afa 
d-2 ,d-2 d-2 
Aye aS 2422), 


and u, =(adjS) , d-1° 


In the case of Eq.(5), Eq. (15) can be transformed 
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into 
b(z"*)t,,(27+)ea(z71)-e(27*), (16) 
where 
1 d-1 
e(z7+)= ew, af Zai ie, Wi, 2, 
detS :=1 


with b, (z71):=p(z71)s(1-a ons The degree of the poly- 
nomial a(z *) is d, and that of e(z~1) varies with the 
order of the TAR model. On some closed domain in the 27+ 
plane it can be evaluated that a(z~t) is finite and that 
the absolute value of e(z7t) decreases to zero as m>, 

Using the difference on the TAR model order dependence 
between a(z“) and e(z7t), we can determine the number of 
roots of bia" ttet t) in a suitable closed domain. By 
using the Rouché Theorem, the asymptotic nature of pole 
location of TAR model can be stated in three following 
Theorems: 


Theorem 1: For any real number €, e<l1-lA lla ls 

in 
there exists some positive integer Mg such that if m>Mg » 
the TAR model of order m has ¢t4y poles in the region of 
tant 1<ha i \(1-e). And ty, poles of the TAR model con- 


verge to system poles aï! (i=1, 2, ..., Lip) as m becomes 
large. 


Proof: Let Ca{zt zt =laqli(i-e)}. then there is a 
positive finite number u(e ) such that 
min{|a(z~1)}}>u(e)>0, since a(zl) has no roots on C and 
z tec 
is analytic inside C. For (le) there is a positive inte- 
ger Mg such that if m>Mg . ule y>le(zt)1, since JA 2 1x1 
(2€1I). From the Rouché theorem the number of roots of 
b(z~+)t,,(z-1) <0 inside C is fin if m>mg. Since there is 
no root of b(z7~1)=0 inside Cy there are ¢ ip roots mei 
tm(Z l)=0. Let Cy={z7* 112 l-a} l=e4} for 0<e es aha l- 


lagi, taĵjt-a3ti) (i,j=1, 2, ..., tin). Similarly it can 


be shown that “each root of tm(z`1)=0 inside Cy converges 
to the system poles Ga | (isl, 2, rs» tip) as 
m>. 


In the region outside the convergence circle the pole 
location of the TAR model is classified into two cases; 
Theorems 2 and 3. 
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Theorem 2: Suppose that a Ta is real and 
bazli<iagti. For 0<e<l-là}llæ , 741, there is a positive 
1 2 1 tin 


integer Mg such that if M>Mg, the TAR model with order m 
has, m-:¿ in oles in the circular region OË 
12 Jtt -e)<iz7ti<ia jt] (+e). 





Proof: Let C! be a closed circle: For 
O<e'<JAqllagri-1, C'e{z-tyizieiagti(iee')}. For a 
positive integer mo there is a finite positive number 
ule ",mp) such that if m>mp, 





1 d-1 
max{|a(z~+)- M u, a d-2a(171)b, (zl) (a ,27tym+1y) 
z-lec' dets: =2 
<u (e',mg), 
since la, ztict (2.22, 3, ..., d-1) on c' and a(z71) is 
analytic on C' . For uw(e',Mg), there is a positive number 


Mg such that if m>mg>mg, 





min {1- uy 29> 2a a jt), (272) (a 27 F811) 
zlec' dets 


>u (e'm), 


since [2 gzip. FON the Rouché Theorem the number of 
roots of b(z27 lita (z7 1)= 0 is m+1 inside C' for m>mg. The 
number of roots of ty EZ 1)=0 is m inside C', since 
b(z” 1). =0 has one root rit inside C'. Stee the degree of 
ita) is m, there is no root of ty(z™ 1)=0 outside c'. 
From Theorem 1, ty (2~ l)= 0 has tin roots inside C. Then, 
there are M-tj4y wookS of ty(z~ 1)=6 in the circular region 
between C and C'. 


Theorem 3: Suppose that agi and a3 are the complex 
conjugate and bagiislagti<iag |. For O<e<i-laylla, stl, 
n 


there is a positive integer Mg such that if mM>Mg, the TAR 
model has m-2z in noe M-¢4y)-1 poles in the circular region, 
lài lI(1-e)<Iz Lielig I (l+). In cases where the TAR 
model has M-¢4y-1 poles in the circular region, the TAR 
mogel nag a single real pole in the outer region of 
Iz bisia }(lte). 





Proof: Let Cr be a closed circle: For 
O<e"<JAq11AgtI-1, c"={z7l]įz°]]=łà J} (1+e")}. Since 
alz“ 1) is analytic Jñside c" and [a Z Pai (083, 4S. aa 
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d-1), there is a finite number v(e",mg) for a positive 
integer mg such that if m>mMy » 





1 d-1 
max{ |a(z~1)- í u, ii Zata ie, (Ait , 2°)" 
z-tec" detS. =3 
<v(e",mg)- 


For v(e ",mo) > there is a positive integer Mg such that 





1 2 
min{| A. u, 22 -Aath ib, to 0a, a eL 
zlec" detS :=1 
>v(e",mg), i 
where M>Mo>my. If we put PRE E e dr PRD M SP a and 
Nl (Ay-Ag)=eexp(i¢). where A,=1A,lexp(i¢) and 


i=3 
Ao=lAzlexp(-i¢), then we have 








1 2 
E n, ao Zata ib, ha 2 ae 
detS :=1 
2iu d-1 d 
= m(a-a,27typ 124197 n ayiryitsin{ (m+d-i-1) ¢+6} 
dets: =3 i=0 
d 


- O ajlàņglİsin{(m+d-i-2)ġ+ø}(1à41z7})] (12; 1z71™*l, 
i=0 


where u=(-1)¢-1 II (Agr-Ay)- 
d-12i>j23 
From the above relation, let us define a real series Zm: 


d 
2 ajlàqlÏsin{(m+d-i-1)ġ+¢)} 
i=0 


Zm” d , (17) 
Jaq) ett! 1l sini (med-1-2) 6 +8) 





and assume that polynomials biz ttai t can be classi- 
fied into two cases; 


My(m,e")={b(z-+)ty(z-+)lizgistaqii+e "/2)}, 
or 
Mo(m,e")={b(z-1)ty(z-1) | iaglziaqil+se "/2}. 


In Mj(m,e"), Zp» is inside C", and in Mo(m, £"), Zi is 
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outside C". In the case wher b(z“`l)t m(271)EM; (m, & 5 
there are m+2 roots of the polynomial biz’ )ty (z +) inside 
cC" for m>mo from the Rouchė Theorem. since roots of 
b(z“1)= =0 inside C" are X31 and 231, there are m roots of 
tniz 1). 0 inside C". Therefore there exist M-tin poles of 
te (27 l) in the e region between C and C". In the 
case where ie Pt ‘dl ) E€ Mo(m,e"), there are also m+1 
roots of b(z ‘tm fa 1). 0 inside C" for m>Mg - There are m-1 
roots of tm (27 ya 0 inside C". In the outer region of C" 
there is a meine real root of tm (z~1)= =0, since the number 
of total roots is m. This pole moe the TAR model outside 
the convergence circle is called a "non-robust singular" 
pole. Therefore there exist M-¢4,-1 poles of efx tj in 
the circular region between C and C", and a single real 
pole in the outer region of C". In this discussion, it is 
assumed that Zm exists outside of the circular 


region defined by Lagi ie" s2gl27 ipsia 1j+3e"/2. Q.E.D. 


Though we hav xamined the pole location of the TAR 
model, we obtained [46-50] the same asymptotic results in 
the AR model analysis numerically as in Fig.3. Therefore 
the TAR model is an analytical substitute for an AR model 
or a representative of an AR-type model for the analytical 
study of such models. The Pole location rule for AR-type 
models have already been reported [46-50] as asymptotic 
properties of vector AR-type models with large order: 





(P-1) Some poles of the AR-type model are system poles 
inside a convergence circle, which is a circle centered on 
the origin with a radius of the absolute value of the ARMA 
zero(s) closest to the unit circle. 


(P-2) There are multiple-circular AR poles almost equally 
spaced in angle on a circle, which represent equivalently 
some ARMA zeros. Since q=1 in the above case, single 
circular AR poles appear. In the case of q observable 
variables, q-multiple circular AR poles usually appear. 


(P-3) There are robust and/or non-robust singular AR poles 
outside the convergence circle. 


The structure of circular AR-type poles guarantees that 
the AR-type model with different order can describe the 
same system with fixed poles. Effects of the other system 
poles and ARMA zeros are expressed by distortions of the 
circular AR poles as in Fig.3. 
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IV.2 Weights of AR Poles [51] 








Since an AR model is a linear equation, it is import- 
ant to examine properties of its eigenvalues and eigenvec- 
tors mathematically. If poles of an AR model correspond to 
eigenvalues, then weights of AR pole become eigenvectors. 
Since the asymptotic properties of pole location of AR 
models have been made clear as in Section IV. 1, we will 
examine properties of weights. In order to proceed ana- 
lytically, the following properties of an ARMA (d,1) 
process are assumed to take advantage of the asymptotic 
pole location rule of AR-type models: 

(1) All ARMA poles are located inside the convergence 
circle. 

(2) The ARMA zero nearest to the unit circle is single and 
real. 

(3) There are no robust and/or non-robust singular poles. 

Let a scalar ARMA (d,1) model under these assumptions be 

expressed by 


d 
M(1-a@ y27+)y(n)=(1-2271) 7 (n), (18) 
j=l 
where a> (j=1, 2, ..., dG) are system (ARMA) poles and 


a7l is an ARMA zero. Weights of poles {85} of the AR 
model o(m,z7+), Eq. (6), are defined b Wy: 





1 m Wy 
ae i oe, (19) 
o(m,z-1) j=1 1-8 jz“! 


Though coefficients of the AR model are determined 
from the fast recursion algorithm, we cannot treat the AR 
model analytically. So, we also analyze a TAR model as in 
Section IV.1. That is, from Eq. (18), 





e(z71)-l;=B(z71)7la(z7l)= £ tjz`i, (20) 
j=0 
1 
where ty=—G(z-1) 10) -1 the TAR of order m is 
Jj! z =0, 


defined by truncating at the mth power of z7l, Further- 
more, the TAR-type model can be approximately expressed 
as, for large m, 
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TARmsg-1(Z + )¥(n)=7 (n), (21) 
d m-1 
where TARpyg-3(z-1)=M(1-a@yz74) £ (az4)J. The last 
jel j=0 
polynomial of the TAR model can be written 
m-1 m-1 
z(az“ljj= m(1-» yz) 
j=0 j=1 
where yp =o], w=exp(2z i/m). In this way, it is easily 
seen that the zero 4A + is equivalent to a group of circu- 
lar poles vit, j=l, 2, ..., m-l. Then, we can decompose 
the transfer’ function of Eq. (21) into a partial fraction: 
1 d Aj m-1 By 
= ooo + Z} —, (22) 


TARm+q-1(Z7}) j=l 1-ajz™l j=1 1-vjz“? 


Weights A; and Bj correspond to the system pole aj} and 
the circular pole glad, respectively. 


Theorem 4: When the TAR model order is sufficiently 
large, weights Aj and Bj are evaluated as 





1-f4 1-wJ 
Aj= a , (23) and aaa” aaa 
-ym 5 -1 5 -1 
(1 e Le Kaj ) ma (1 ayy ) 
kžj 
Proof: From Eq. (22), the weight Aj of the system 


pole at is given by 


Ay =(1-a y271) TAR g-1 (27171 ae oe 
Zz =a 4 
1 





[= 


= m-1 Skp s ` 
(-apa jt) (1-2@ aj?) 


k=1 
k# Jj 


w 





By using the identity 


m-1 m-1 
N(z-ok)= 2 zk, (25) 
k=1 k=0 
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and setting Cyshaji, we have 





m-1 m-1 m-1 
n (1-aw-Kajty=- N (2075) n {-ajt+(aw*)-t) 
k=1 k=1 k=1 a 
m- m-1 m-1 ¢ My 
=(-a)7M 2 (awk) Wg y-wkye z rie, 
k=1 k=1 k=0 Egat 


Then we have Eq. (23). From 


since w is the mth root of 1. 
the weight Bj of one of the circular poles, 








Eq. (22), 
pgto is also evaluated by 
-1 -1 
B,=(1- Z TAR az: S) 
‘ieee ae m+d-1 sia 
i. 
1 
a m-1 ' 
T (1- 71) M (1-wJ7K) 
gay Te pS 
k#j 
Since we have relations 
I (1-05-K) m-1 
k=1 N (1-w) 
m-1 j-k k#j k=1 
n (1-w )= no Se 
k=1 1-w4 1-wJ 
k#j 
the weight Bj is given by 
(1-wJ) 1 
o —, 
m 


By= 
d ay 
m ee RY ) 


where we have used the relationship obtained from Eq. 


(25), 
m-1 k 
T (1-W¥)=m. Q.E.D. 


On the other hand, weights of the ARMA poles inside 
the convergence circle are also evaluated from Eq. (18) or 


(20) as 


l-a ;z7})G(z71)7t 
Aay zlaga a 


CONTRACTION OF INFORMATION AND ITS INVERSE PROBLEM 37 


Comparing Eq. (26) with Eq. (23), we find that the dif- 
ference between both weights of system pole is (a-¢R)-1, 
When the model order m is large, weights of system pole of 
an AR-type model converge to those of the original system 
poles since the system poles are assumed to be inside the 
convergence circle: beMi<i. 


Then we can summarize properties of weights of AR- 
type model with a large model order as the separation rule 
of AR poles: 


(W-1) A pole of AR-type model is one of system poles 
inside the convergence circle if its weight is constant 
for model order change. 


(W-2) A pole of AR-type model is one of circular poles, 
if its weight is inversely proportional to the model 
order. 


It can be concluded that this property of weights of AR- 
type model is used for separation of a system pole from 
circular poles as in Fig. 4. The asymptotic weight rule of 
vector AR-type model has been discussed in papers [51,52]. 


V. APPLICATION TO SYSTEM IDENTIFICATION IN FEEDBACK 
SYSTEMS [53] 


Val The Forward Problem 





A representation of a stochastic feedback system with 
open loop transfer functions is defined by 


m 
nee a Fy y(z7 bys (mn) +Fy (274) fy (n). (1) 
j#Ai Piety 2s. eee vim) 


where Fy (z71) ana F,(z71) are open loop transfer function 
matrices, y;,(m) are output vectors, and fi(n} are physi- 
cal random input vectors. In the present section the 
feedback system is assumed to be a square system. That 
is, the dimension of yy(n) is equal to that of fi(n). 
dim y,=dim fy=qy (qyeZ). It is assumed that the random 
vectors are Gaussian and white, E{f,;(n)}=0 and 
Effi (n) f(n’) TVi jonny where Vij is a ayX4 j variance 
matrix, Vii" Yije The stochastic process described by Eq. 
(1) is a linear Gaussian stationary process. For example, 
the BWR type simulation model treated in the benchmark 
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test [20] of SMORN-V has m=3 and q}=G9=q3=1. 


Without loss of generality, let us set m=2 hereafter 
for the simple treatment as in Fig. 5: 


¥,(m)=Fy9(z7!)yg(n)+Fy (271) fy (n) 
Yo(n)=Fo1(z-1)y4(n)+Fo(z-+)f9(n), (2) 


Om mr ae aes aes as am ÇUNI as at as as a f s (n) 
\ \ \ \ 

a -1 
F i2(z ) F (z pp EREN 
brans ens ene aro awe ane ab brane aw ame are are abs 






y 2(n) 


rama as as an Oe as a) at at a AA 
N 


ne r.z] Fariz") ya(n) 


f aln)  Mermrmcmermeraeh N 


Figure 5. 4-block feedback system 
We call Eq. (2) the "4-block" feedback system with open 
loop transfer functions F,(z7+), Folz th, Fj2(271), and 
Fə; (z`}). Let open loop transfer functions be assumed that 


lim F} (z71) and lim Fy(z 1) are finite, 


z7l—0 z710 (3) 
Lim Fy9(z71) and Lim F1 (z71) are zero, 
z--0 Zz -—-0 


and that irreducible and minimal realizations of transfer 
functions are 


Fy 9(z~+)=Cy9(I-Ayo27+)~ 1B, 9274 

Fo, (271)=Coq (I-Agyz71)~1By127+ (4) 
F,(z7+)=C, (1-Ayz>1)~1B, 
Fo(z~1)=Cg(I-Agz~1)~1By, 


where Ajj Aj,» Bij: B;, Cįij and Cy; are matrices with 
suitable sizes. We can rewrite the feedback system (2) in 
the state space representation: 


X49 (N)=Aq 9X49 (N-1)+By 9C91X9q (N-1)+By 9Cox9(n-1) 
xı (Nn) =A, xX, (n-1) +B, f(n) 
X91 (N) =Bg1Cq 9X4 9(n-T)+Bg,C; xX, (n-1) +Ag1 X94 (n-1) 
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X(N) =AyX9(n-1)+Bofg(n) (5) 
¥ (mn) =Cy 9X1 9(n) +Cyxq (n) 
Yo (N) =Co4X9q (MN) +CoXg(n) ‘ 


In a compact form we have a state space model 


x(n)=®x(n-1)+Gf(n) 


y(n)=Hx(n), (6) 
where x(n):=(xXz9(n)7, x,(n)7, X91 (n)7, x9(n)T)T is a d- 
dimensional state vector, y(n):=(y1(n)T, ya(n)T)T is a q- 
dimensional measurable vector (q=q)+q9), f(n):=(f,(n)?, 


fo(n)T)T is aq-dimensional white Gaussian random vector 
with E{f(n)}=0 and E{f(n)f(n')T}=Véo pn. and matrices 0, 
G, H and V are defined as 


Ay2 9 Bylo) B12C2 0 0 
g 0 A 0 0 cei By © 
Bair PCi Agr 9 o W 
0 0 Ao | 0 Bol, 
H= | 12 ©, 0 0 and V= Vila Vi2 
© © Gz zj viz Yaj 
The state space model (6) is a starting equation as in 
SectionII, and matrices ®, HG, and V are assumed to be 
nonsingular. 


All eigenvalues of ® of stable feedback system lie 
inside the unit circle in the complex plane, since the 
feedback system is stable. From papers [24,54], poles and 
zeros of the square system with the feedback mechanism are 
defined as roots of following characteristic equations, 
respectively. 


System poles: det(I-0z~1)=0 


System zeros: det S(z~1)=0, (7) 
where 
_1,_]z 71-1 G oz-1-1 oGz7} 
S(z>*)= > 
H 0 H HG 


The last equivalence holds in the second type of represen- 
tation as in Section II.1. From Eq. (7), system zeros in 
the 271 plane are (d-q) finite zeros and q infinite zeros 
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under the controllability and observability conditions of 
D, Hand G [25]. For simplicity, we assume hereafter that 
all finite poles and zeros are distinct. Since HG is a 
nonsingular matrix, system zeros are determined from the 
characteristic equation (7): 


detS(z~1)=det (HG) det{[1I-G(HG)~1H)z71!-1}=0. (8) 


From Eq. (8), system zeros in the z 1 plane are reciprocal 
numbers of eigenvalues of ® [1-G(HG) 714}. It should be 
noted that all system poles in the z * plane lie outside 
the unit circle in the complex plane from the assumption 
of stability. However, the system zeros distribute in the 
total plane. 


By eliminating x(n) in Eq. (6), the state space model 
can be transformed into a closed transfer function form, 


y(n)=Gp(z-!)f(n), (9) 


where Gp(z71):=H(1-@z71) "16 is a closed loop transfer 
function matrix. It is well-known that the McMillan degree 
is the minimum number of the state variables in the state 
space realization of the transfer function. On Eq. (9) we 
need a suitable condition under which the sum of the 
McMillan degrees of four subsystems, Fjalz t), F31 (z771), 
F,(z71) and Fo(z-+), is equal to the McMillan degree of 
Gelz t). If the condition is satisfied, then the minimal 
realization of the four subsystems interconnected as in 
Eq. (2) is a minimal realization of the overall transfer 
characteristics from f(n) to y(n). Otherwise cancellation 
of pole and zero inevitably occurs in the process of 
reducing the feedback structure into Gp(z71), and hence 
the interconnected system is not the most compact repre- 
sentation of the overall transfer function of the system, 
even if the most compact representation is used for each 
of the four subsystems. The condition in the case of q=1 
has already been examined in detail by Kishida and Suda 
[55]. Thus, it is assumed that there is no cancellation of 
pole and zero in Gp(z}). 


In the reminder of this section, all open loop trans- 
fer function matrices are assumed to be the same-size 
square matrices, i.e., dim y,=dim fy=q/2, for easy de- 
scription of poles and zeros of open loop transfer func- 
tion matrices. Poles and zeros of each open loop transfer 
function matrix are defined by roots of the equations: 
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Zeros of transfer function matrices: 
njj(z7})=0 or nį(z7})=0 

Poles of transfer function matrices: 
dįj(z7})=0 or dį(z7})=0, 


where ny y (271) /dyy(z7+)=detFyy(z71), my(z74)/dy(27}) 
=detF,;(z~+) and each d pom na er and umra o are co- 
prime. If we put Welz tja z * )=det[I- Fig(z” pe te") I. 
then system zeros are defined from roots of 
de pze l)n (z7 *)ng(2- lye =0 and system poles are also given by 
roots of ng(2!)dy(27 "8262"? “ys 0, since detGp(z~+) 
=det[I- Fio(z” ijp (z? 1))-1 detF,(z™ 1) detFa(z7!). 


In the 4-block feedback system, there is a one-to-one 
correspondence between open and closed loop transfer 
function matrices. Furthermore, the open loop transfer 
function matrices are uniquely expressible in terms of 
closed loop transfer function matrices [56-60] as 


Fyo(z- 5 =649(z-1)Gy9(z71) 71 

F21(271)=G24(27})G]; (272) 7} 

F, (271) =G1 4 (z71)-Gy9(27!)Gg9(z71)-165, (271) (10) 
Fo(z~1)=Gy9(z~+)-Goy (2z71)6y4 (2-1) -46,9(271). 


Inversely, 


Girlz i =(I- “Pygla7 Fan te) Fa (271) 
Goo(271)=(1-Foy (271) Fy9(z71)) “Fg (271) 
Gyo(271)=(1-Fy9(2-1) Foy (274)) Fy g(z7t)Fo(z-1) (11) 
Go, (27+) =(1-Foy (2-4) Fy 9(z71)) “Fay (22) Fy (z271). 


From the equivalent representation of measurable 
correlation functions and Eqs. (2.3) and (2.4) in Section 
II.1, the innovation model in the feedback model (6) is 
determined as 





x(nin)=®x(n-1/n-1)+Ky7 (n) 


y(n)=Hx(nj{n), (12) 
where x(n|m)=E{x(n)|¥(m)}, Y(m):=(y(m)T, y(m-1)", oe hs 
7(n)=y(n)-y(nin-1), T:=E{7(n)7 (n)T}=HPHT, rank” =q, 
K:=PH*r™*, and P is a stable solution of an algebraic 


Riccati equation, 


P=0 {P-PH'r ~lyp}oT+ever. (13) 
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As mentioned in Eqs. (2.6) and (2.8), measurable correla- 
tion functions can be expressed by two types of represen- 
tation, 


Cyy(n) =ECy(n)¥(0)T}:=HOMC,,(0)HT=HO™ EOKKrKT (OT) KET, 
k=0 (14) 
[cf. Eq. (6)] (ef. Eq. (12)] 


where Cy, (0) is a steady solution of the Einstein relation 
Cyy(0)=OCy,(0)@T+GVGT. From the stability of the innova- 
tion model (12) all eigenvalues of (I-KH) ® lie inside the 
unit circle in the complex plane as mentioned in Eq. 
(2.21): 


leig{(I-KH)®}/<1. (15) 
This means that the innovation model (12) is of minimum- 
phase, since zeros of Eq. (12) are roots of Eq. (2.17) or 


(2.22) and eig{®(I-KH) }=eig{(I-KH)®}. 


V.2 The Inverse Problem 





From Eq. (2.29), three coefficient matrices of inno- 
vation model (12), ®, K, and H, are determined as A, B, 
and C numerically via the method of singular value decom- 
position of Hankel matrix, of which elements are measur- 
able correlation functions. Then, a data oriented or 
numerical innovation model is given by 


x(n|n)=Ax(n-1|n-1)+By (n) 
y(n)=Cx(nIn), (16) 


where A=T~10T, B=T7!K, and C=HT with a transformation 
matrix T in comparison with Eq. (12). From Eq. (16) a 
numerical closed loop transfer function model is obtained 
uniquely as 


y(n)=G67 (z71)7(n), (17) 


where G7 (z71):=c(1-Az71)71B and the superscript y denotes 
that a quantity is calculated from the numerical innova- 
tion model (16). We should pay attention to that there is 
no T-dependence in the expression of G7 (271) or G7 (z71) 
is invariant for the T-transformation, since G7 (z71) 
=H(I-@z~1)"1K. The difference between Gp(z-+) and G7 (z71) 
is that between K and G, since the original closed loop 
transfer function matrix is Gp(z~+)=H(I-oz71) 16. 
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If P_is ged to be a block diagonal matrix, then 
from K= PH! (HPHT)-1 we can have the same type of relation- 
ship as the original transformation formula (10) and (11) 
between closed and open loop transfer geet ton matrices: 
Since H and ® are common between =" 1) and G7 sA ly), no 
difference is found in both Fy9(z™ ly and Fo, (z7 l) under 
the condition that the equivalent feedback loop has the 
same type of structure as the original one. In this case 
we can estimate in principle the original transfer func- 
tion matrices via the transformation formula (10) and 
(11): F]g(Z71)=F]3 (271) and Fa} (z71)=F37 (271). Otherwise, 
the other condition is needed for identification of both 
original transfer function matrices Fyg(z~) and F94 (Zz7>). 


In the other case where P is not block diagonal 
matrix, the equivalent model has new additional feedback 
loops different from the original system. From Eq. (17) 
there is an equivalent 6-block feedback model with a 
different feedback loop structure, 


y(n) =F{9(z7*)yo(n)+F{ (z-1)7 1 (n) +FZ (271) 7 9(n) 
¥o(n)=FS,(z7t)y, (n)+F (271) 7 4 (n) +F3 (271) 7 g(n), (18) 


A difference between Gpl2z- 1) and G7 (z` 1) has effects upon 
not only F{ (z7 ly and F3 (z7 l) but also appearance of new 
additional noise loops F4 (z7 ly and FX (z7 l} according to 
the structure of P. The appearance of additional loops 
leads to irremovable difficulties in identification of 
Fi9(z~*) and Fo, (271). So, we must examine structures of 
P. Conditions needed for the identification of open loop 
transfer functions will be introduced in detail. 


MeZ sl Identifiability 





Since H and ® are common Ree tee Gr A 1) ang 
G7 (z7 l), no difference is found in Fio(Z 1) ae F94 (Z7 1) 
under the condition that the equivalent feedback loops are 
the same type of structure, 4 blocks of transfer func- 
tions. Under the condition, we can obtain the original 
open loop transfer function matrices from G7 (z71) of a 
numerical innovation model by using the transformation 
formula (10): 


Fį2(271)=F]3 (z7}) and Fo)(27!)=Fof (27}). (19) 


After matrix calculations of F{(z7}) and F}(27 from 
closed loop transfer function matrix G7 (z™“*) e o to 
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Eq. (10), we have 


Ff (2-1) =F, (z71)m, (274) 
F3 (z~1)=Fo(z7+)mg(z74), (20) 


where mq (Z7 1) and mo( z` 1) are suitable matrices. This 
means that F (z7 1) ar F(z l) are not determined from a 


numerical innovation model in principle. We should pay 
attention to these ambiguities in the system identifica- 
tion of open loop transfer functions. From Eq. (11) and 


the above relations (19) and (20), we easily have 


“1) 6f,(271 l) o 
a? (272) ee ae -öp (a7) m,(z ~~) 2| any 
G3,(z-*) G3o(z-*) 0 mg(z*) 


Inversely, we can have the relations (19) and (20) from 
Eq. (21). As mentioned above, whether transformation 
formula (10) and (11) can be used or not is important for 
identifiability of both open loop transfer function ma- 
trices Fio(z*) and Fo3(z+). Such a condition is satis- 
fied in the case where P is a block diagonal matrix. We 
will examine such conditions. 


V.2.2 Minimum Phase Feedback Systems 





Substituting a matrix ever into the algebraic Riccati 
equation (13), we can easily understand that P=GVG! is a 
particular solution, since HG is nonsingular. Suppose that 
P=GVGT is a stable solution, then we have 


K=PH'r ~l=gvoTHT (uaveTHT) -1=G6(HG)-!. (22) 


Since the closed loop transfer matrix of innovation matrix 
is given by 


G7 (z7l)=H(1-0z71)-1k 


=H(I-@z~!)~16(HG)-1=Gp(2~+) (HG) “+, (23) 
we can understand that the my (z7 1) a Eq. (21) are con- 
stant block diagonal elements ees (HG) *. In this case, we 


have an equivalent 4-block feedback model from Eq. (22), 
since G and HG are block diagonal matrices. As P=cvc! isa 
steady solution, it is necessary to check its stability, 
Since P used in the expression of the innovation model 
must be a stable particular solution. This will be exam- 
ined in the following. 
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If a reactor plant is assumed to be a minimum phase 
system, then all system zeros in the gi plane lie out- 
side the unit circle in the complex plane. For the 
system, the minimum phase system is expressed by the 
requirement that eigenvalues of @®[I-G(HG) +H] lie inside 
the unit circle in the complex plane from Eq. (8): 
feig{® (1-G(HG) ~1H)}I <1. To examine the stability of 
P=GvG', we can use the property that all eigenvalues of 
® [1-G(HG) 1H] are inside the unit circle, if the reactor 
under consideration has a minimum phase property. We can 
easily understand that eigenvalues of ® (I-G(HG) 74H) are 
stable from Eq. (C.5). From Eq. (22) it should be noted 
that the condition of minimum phase corresponds to the 
stability condition of the innovation model (15) under the 
assumption of minimum phase, since ® (I-G(HG) ~1H) =@ (I-KH). 


V.2.3 Independence of Random Forces 





Suppose that a reactor is assumed to be a nonminimum 
phase feedback system, then some system zeros in the case 
of the zh plane lie inside the unit circle in the complex 
plane, though all system poles are outside the unit circle 
in the zl plane. In this case, we need one other assump- 
tion to estimate the open loop transfer functions, such 
that P is a block diagonal matrix. It has already been 
shown in control theory [57,58] that the block diagonal 
matrix of P is a necessary and sufficient condition for 
identification of open loop transfer functions in a 
4-block feedback system, and that the assumption of inde- 
pendence of noise sources is one of sufficient conditions 
to obtain the block diagonal matrix of P. The independence 
of random forces means that V is a block diagonal matrix. 
After straightforward calculations, we can see that 
® g= (I-G(HG) ~ 1H] is also a block diagonal matrix in the 
feedback system, since matrices H, G and HG are block 
diagonal matrices: 


® 0 
vaso rrome imd L (24) 
® 
22}: 
where 


A12 0 
oi" -1 cB.) -tC 
S Ägg 0 
O22" | 4 Bo (CyBy) IC By (CoBy)~1c 
292t02B2) C21 Ag7AgBotC2B2) C3 


From the symplectic form of a Riccati equation, Eq. (C.10) 
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in Appendix C, it can be concluded that a stable particu- 
lar solution of pS=p-GvG! is a block diagonal matrix, 
since cvc? is a block diagonal matrix under the assumption 
of independence of noise sources. Otherwise, P is not 
guaranteed to be a block diagonal matrix. 


If GVGT is a block diagonal matrix, HĪRTIH is a block 
diagonal matrix, since H is a block diagonal matrix. From 
Eq.(C.10) we have 


sys Tp-lyys 
© Ju$=UF A +H Rl HUgA 
U = OQUSA. (25) 


From Eq. (25) we can see that particular solutions, U$ 
and Us. are block diagonal matrices, since gq and HITRIH 
are block diagonal matrices under the independence assump- 
tion of noise sources. Then, 


PS=ugu$-! or P=PS+GVG, (26) 
which is also a block diagonal matrix. 


Since P is a block diagonal matrix, then from 
K=PH! (HpH?T)~1 we can have the same type of relationship 
as the original transformation formula (10) and (11) 
between closed and open loop transfer function matrices. 
Since H and ® are common between Ge(z~t) and G7 (z-1), no 
difference is found in both Fy9(z74) and Fo,(z7), because 
the equivalent feedback system has the same type of feed- 
back structure as the original one. In this case we can 
estimate in principle the original transfer function 
matrices via the transformation formula (10) and (11) as 
already mentioned in Section V.1. Therefore, we can have 
an equivalent 4-block feedback system corresponding to the 
numerical innovation model (16), 


¥4(n)=Fy9(271)yo(n)+F7 (z71) 7 4(n) 
¥o(n)=Fo;(z-1)y,(n)+FZ (z71) y(n), (27) 


Hence, Eqs. (19), (20) and (21) are held. 





V.3 A Simple Example 





Let each open loop transfer function of a power 
reactor with feedback loop be approximated by the first- 
order lag type of transfer function: 
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F19(z71)=c)9(1-a,9z71)~1b, 5271 

Bai (2 Yeegg (1-agye ) baie? (28) 
F,(271)=¢, (1-a,27 )~1b, 
Fo(z~+)=c9(1-agz71)"1bg, with ¢,b,=cybo=1 


where Fyo(zty and Fo, (27-1) are considered to be zero- 
power and feedback transfer functions, respectively. In 
this example, d=4, q = 2, and constant matrices of state 
space model (6) are defined by 


4812 0 by2¢91 Þ12°2 0 0 
O= 0 ay 0 0 G= by 0 
be1¢12 Þ21€1 421 0 0 0 
0 0 0 ao J. 0 bo), 
H=|C12 1 9 0 and v=| Y11 12 
Do G Say Sg hs Vig, Yaa) 


Poles of the feedback system are defined by 4 roots of the 
characteristic equation, 


det(I-0z71)=0, (29) 
or, (1-a,2z7!) (1-agz71) {1-(ayptag,) 271+ (ay oa91-w)z~7}=0, 


where w=by 291° 12°91 and by 2¢1270. If w>0 (or b91€91>0), 
then the feedback is positive, and w<0 (or bg1€91<0) means 
the negative feedback. Zeros of that are given by roots of 


det s(z71)=0; 
z-lea,.7}, agı t, and two infinite zeros. (30) 


In the example, eigenvalues of generalized eigenvalue 
problem Eq. (C.3) in Appendix C are 0, 0, ajg, 491, a12 
a21 » ©, ©, ©, and ©, By using stable eigenvalues, a 
corresponding innovation model can be determined from a 
particular solution of the symplectic form of the Riccati 
equation in Appendix C as in the two following cases. 


V.3.1 Minimum Phase Systems 





Since the reactor is a minimum phase system, then 
stable eigenvalues of Eq. (C.3) in the example are inverse 
numbers of system zeros, 0, 0, ajo: and ag» since system 
zeros ajz! and a21! are outside the unit circle. It should 
be noted that the zero-power transfer function is subcrit- 
ical under the condition, since laygl<1. Solving the 
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symplectic form, we have a set of 4 eigenvectors: 


1 101 9¢]" 0 0 
uf o 0 1 a9C21C3 
= 0 0 0 a21 (31) 
0 0 0 0 
us o o0 o o0 
0 0 0 0 
0 0 0 0 
Therefore, we have PS=U3U3"*=0, and then we obtain the 
particular solution, P=GVG*. From the calculation of K as 
in Eq. (22), the innovation model has the same structure 


of the original feedback model, which has the 4-block 
feedback structure. From Eq. (10) original transfer func- 
tions, Fio(z-+), Fo1(z 1), F, (27+), and Fo(z-*), can be 
identified in principle, since cyb,=Cobo=1l. If cb #1 and 
Cobo#l, then F,(z71) and Fo(z ty have ambiguities in 
magnitude. 


V.3.2 Nonminimum Phase Systems 





Suppose that one zero, z-1ea,5). is inside the unit 
circle (laygi>1), then stable eigenvalues of Eq. (C.3) in 
this case are 0, 0, ag» and a121. The condition means 
that the zero-power transfer function is supercritical, 
and then the reactor is stabilized by the negative feed- 
back Fo, (27+) as in the next section V.3.3. Solving the 
symplectic form, we have a set of 4 eigenvectors: 


X1 a1¢10¢74 0 0 
X9 a12 0 0 1 
sS > 
Uz Xg 0 1 A9C91C9 
= X4 0 0 a21 (32) 
0 0 0 0 
0 0 0 0 , 


where 
X= (a19!-a,9)71(a,51-a,)e 907! X2 
X9=611b]7(aya,91-1)c19 
x= (12'-ap1) + (aj3*-a2)c2103*x4 
x4=-811(V12/V22)b7 b31 (aļa13*-1)c12 
811=Y22(Y11Y227Y12) ©: 
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Here the first column of Eq. (32) is different from that 
of Eq. (31), because a49 is not a stable eigenvalue. In 
this case, we have a particular solution, Pp=UĝU$ 1. That 
is, 


Pig ~Cy2a Perte 0 o0 
-€1281P11/ (C1812) (C4 984)°Py4/(cyayQ)" 0 0 
Phe 0 o 0 01|(33) 
0 0 o of, 
where 3 1 1 2 2 
Py 17442" (419-449 844 C1 9° (ay -ay9) 
Since we have P=GVG!+P,), we can calculate K: 
Ki ky 2° 
K= k11 k12 (34) 
0 0 
0 by |, 


where 
nkj1®=(c12+ac1)P11V22 
7140 (Cy 9+ @ Cy )Py4V99tby V4 1V99-b4V4 2" 
nkjg2®=-(c]2tac])P11Y12 
7 ky =e) by (cy + acy) P11 Vy 9 
7 =(Cy9+aCy)°P11V299+V41V29-V19 
a =-C1281/ (C1812). 


From the closed loop transfer function matrix of the 
numerical innovation model, G (z71)=H(I1-0z7}) İk, we have 
an equivalent 5-block feedback system described by 


yı (n)=F]g(z7l)yg(n)+Fi (2-4) 7 y(n) +F3(z74) 7 9(n) (35) 
¥9(M)=Fo3(z71)y4(n) +Fg(z-1) 7 (n), 
where 

F,9(z~1)=¢,5(1-a,9z71) 71h, 927 
12 agts-ay3e 122 
Fo (Z27})=c3}(1-a2127})71b9]27 
Fj (27})sc]9(1-a]2271) 1k] ]ĉ+c](1-aqz7}) lk] 
Fo(Z7})=c9(1-a9271) lbp 

F3(2-1)=¢49(1-a4 9271) tk]g2+c] (1-a,z71) Tlk] gP. 


1 
1 
b 


Using transformation formula (10), we can have the feed- 
back loop transfer function, Fai (z7t)=G37 (271)6,7 (271)-1, 
but we cannot identify the zero-power transfer function, 
Fyo(z*) due to the appearance of an additional loop 
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F3(z-1). Suppose that the assumption of independence of 
noise sources holds, Vy9=0, then we have ky 92=k, 9°=0 in 
Eq. (34), and F3(z~1)=0 in Eq. (35). Since we have an 
equivalent 4-block feedback model, then the identifiabili- 
ty of F)2(z7+) is recovered in the case of Vj,9=0. 


V.3.3 Application to Power Reactors 





In this section we will check situations for a power 
reactor in which the condition of minimum phase is broken 
and also in which the reactor is in a stable region. From 
Eq. (30), the condition of minimum phase holds in the case 
where laqygl<1 and lag, I<1. We need a condition for the 
reactor stability such that a power reactor of nonmlnimum 
phase may be operated steadily. From Eq. (29) and the Jury 
method [61] (the linear stability criterion in discrete 
time), we have the following three conditions such that 
all poles of the feedback system with Eq. (28) are outside 
the unit circle in the z` + plane, or roots of the quadrat- 
ic equation of x, x2~ (a1 94891) X+819891-w=0, lie inside the 
unit circle in the complex plane: 


1) 1a,949,-Wl<l 
2) 1-a19-a91 +81 9891 -W>0 (36) 
3) 1+819+A891+819891-W>0. 


Domains of the stable region and the minimum phase region 
in the parameter-space, (ajo, a91)» are shown in Fig. 6. 
Figure 6a is in the case for w=0.5>0 (positive feedback), 
and the case for w=-0.5<0 (negative feedback) is shown in 
Fig. 6b. As shown in Fig. 6a, all stable regions are 
included in the domain of minimum phase, and it can be 
concluded that there is no stable system of nonminimum 
phase with a positive feedback. From Fig. 6b, we can find 
a domain where the system with negative feedback is stable 
and of nonminimum phase. The domain corresponds to usual 
operations of power reactor with negative feedback. As 
there is an open question of the PSD observed in JRR-1 
[62,63], it is necessary to describe a nonminimum phase 
system where the reactivity is more than critical and 
the negative feedback transfer function makes the reactor 
stable. This type of instability in the open loop transfer 
function is called the soft mode instability [64]. There 
is another type of instability. In the case of an open 
loop transfer function with more higher-order lag type, 
there occurs the hard mode instability [64], which is also 
related to the nonminimum phase. Therefore, the present 
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Non-Minimum Phase 


Mininum_Phase! 





Aja Ay 

a) Case W=0.5 b) Case W=-0.5 

Positive feedback Negative feedback 

Figure 6. Stable region of reactor with feedbacks in the 
parameter space (a9, a21) 


theory is useful for system identification in such situa- 
tions, since there still remains an open question [20] to 
determine loop transfer functions even in the case of 
minimum phase. 


VI. CONCLUSION AND REMARKS 


Forward problems via the contraction of information 
were discussed in SectionII.1 and III from the projec- 
tion to time series data space of measurable variables 
due to the observation, and in Section IV from the elimi- 
nation of ARMA zeros due to the AR model analysis. Inverse 
problems were discussed in Sections II .2 and V. The AR- 
type model useful in the inverse problems is discussed in 
SectionIV. 


In Section II, we discussed the framework of a uni- 
fied theory of reactor system identification and stochas- 
tic diagnosis. Innovation models are essential in this 
framework from the viewpoint of the forward and the in- 
verse problems. That is, we can summarize the relationship 
between the first and the second type of state space 
models, innovation models and Riccati equations as in Fig. 
7. The second type of innovation model is recognized [65] 
to be the same as the Kalman filter in the case of the 
second type of state space model with correlated noises as 
in Appendix D. The second type of Riccati equation is also 
related to the generalized eigenvalue problem and the 
symplectic form. 


52 K. KISHIDA 


Discrete System === Kalman Filter Formulation 


and Observation Eq. (D.5) 
Eqs.(2.1) and (2.2), 
Eq.(5.6), or Eq.(D.3) Eq. (D.7) 
Eq. (D.10) 
First Type of <q Second Type of 
Innovation Model Innovation Model 
Eqs.(2.3) and (2.4) Eqs. (2.19) 
or Eq.(5.12) and (2.20) 
Eq. (D.9) 
First Type of q— Second Type of 
Riccati Equation Riccati Equation 
Eq.(2.5), Eq. (5.13) Eq. (C.2) 
or Eq.(C.1) or Eq.(D.6) 


Generalized Eigenvalue 
Problem 
Eq.(C.3) or Eq.(C.8) 
Symplectic Form 
Eq. (C.10) 


Figure 7. Relationship of models and equations 


In the practical identification of an engineering 
plant by a stochastic model, effects of observation noise 
cannot be neglected [66]. In a system with observation 
noises, it is important to keep in mind whether an inno- 
vation model belongs to the first type of Eqs. (2.3) and 
(2.4) or the second type of Eqs. (2.19) and (2.20), since 
both approaches with observation noises become more simi- 
lar styles than ones without them. Paying attention to 
invariant quantities in both representations, we can 
clarify the stochastic analysis. 


In Section III , we discussed the physical foundation 
of the system model by the system size expansion method. 
It is assumed in the method that the system under consid- 
eration has a uniform structure in space. Furthermore, it 
is also assumed that we can neglect spatial problem by 
setting a suitable sampling time. Hence, we can have a 
point model in the suitable "window" from the viewpoint of 
contraction of information as mentioned in Introduction. 
In a different time sampling, the effect of space- 
dependence is not always negligible. If we adopt the cell 
size expansion, like a diffusion model, instead of the 
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system size expansion, spatial dependent phenomena occur- 
ring in a power plant can be described. 


Under the assumption of a point model we have intro- 
duced the ARMA(2,1) model of neutron number in a zero 
power reactor. Recently, a similar approach to ARMA models 
has also been developed by Tri [67-69]. Usually, the time 
series approach treats a reactor as a "black box". In the 
theory of reactor noise a reactor is treated as a "white 
box". The aim in Section III.2 is to bridge conventional 
reactor noise theory (white box theory) and a time series 
analysis (black box analysis). The approach may be called 
a "gray box analysis" for stochastic diagnosis [34]. 


In Section IV, we reported the pole location rule of 
AR model and the pole separation rule of AR weights, when 
the AR model order is so large that asymptotic properties 
hold. On the other hand, properties of the AR model for an 
intermediate model order should be discussed, which is 
most important for practical use. We can give a comment on 
such properties of AR poles and weights as follows: 


From the rule for AR poles, a scalar AR model of 
order m with AR poles aj can be expressed as 


= m = 
d (m,z ‘yy (n)= A (1-8 32 T)y(n)=eq(n). (1) 
When the noise source of the AR model has unit variance, 
the logarithmic PSD is given by the sum of logarithmic PSD 


corresponding to each AR pole, 


P(w)= -2log!1-f z+}, (2) 
i=1 J 





where z=exp(iw). Each term in Eq. (2) corresponds to the 
PSD of the AR pole. Hence we can introduce a tool called 
the PSD contour-map, which is proposed for interpretation 
and evaluation of the AR poles in terms of PSD [70]. This 
tool can visualize the structure of AR poles for interme- 
diate AR model order, and we can discuss visually how the 
information of poles and zeros of ARMA model outside the 
convergence circle is embedded in the AR model. 


By comparison of Eqs. (4.23) and (4.26), there re- 
mains the factor (1-aa@5™)-1 of weight in an intermediate 
AR model order changes, though it disappears in a large AR 
model order. When the real zero a7! is near the absolute 
value of the complex system pole a zt, the weight of the 
AR pole does not converge smoothly to that of the system 
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pole inside the convergence circle, and for intermediate 
AR model order changes, the factor becomes oscillatory. 
Hence the decay ratio, which is used for the evaluation of 
reactor stability by using the AR model fitting, loses 
robustness for the transient model order dependence due to 
statistical errors. When the system is near the hard mode 
instability, the robustness of the decay ratio is lost in 
the estimation for an AR model order dependence [71], 
since a dominant pair of system poles and a real zero are 
near the unit circle. 





In Section V, a unified theory in a feedback system 
was shown for reactor diagnosis. First, a system is as- 
sumed to be described by a state model with feedback 
mechanism, and the square feedback system is treated. 
Secondly, we have a theoretical innovation model equival- 
ent to the stochastic feedback system from the viewpoint 
of contraction of information. Finally, we obtain a data- 
oriented innovation model, taking advantage of singular 
value decomposition of the Hankel matrix. Comparing the 
physical state model, the theoretical innovation model and 
the data oriented innovation model, we have pointed out 
how to obtain open loop transfer functions from observed 
time series data. From the formalism it can be found that 
there is a constant particular solution, P=Gvc! , of the 
Riccati equation (5.13) or (C.1). Then, identifiability in 
feedback systems can be discussed easily: If the feedback 
system is minimum phase, open loop transfer functions 
between state variables are identifiable. If the feedback 
system is of the nonminimum phase type and random forces 
are independent of each other, open loop transfer func- 
tions are identifiable. In both cases, innovation models 
equivalent to original feedback systems are 4-block type 
of feedback structure. Even in the case of the nonminimum 
phase where a 5-block model is equivalent to the feedback 
system, open loop transfer functions corresponding to 
stable zeros are identifiable. Though these conclusions 
had already been discussed implicitly or explicitly by the 
Anderson school from the viewpoint of control theory, the 
question of identification of open loop transfer functions 
is still open in reactor diagnosis as in the benchmark 
tests held in the SMORN meetings [10,14,16-18,20]. Espe- 
cially in a power reactor with a negative feedback, the 
property of nonminimum phase is related to the case where 
the zero power transfer function is supercritical. Fur- 
thermore, the innovation models, of which the zeros are 
related to the pole location and weight rules of an AR 
model as in Section IV, are useful for reactor diagnosis 
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and system identification, and then we have shown the 
unified theory of innovation models in feedback systems. 


It should be noted that the identifiable conditions 
of minimum phase and independence of random forces are 
sufficient. If we can find another suitable approach, 
there is a possibility that open loop transfer functions 
are identifiable from an equivalent 6-block feedback 
transfer function model, since the equivalent feedback 
system has the same matrices ® and H as the original 
ones. 


Finally we have formulated a method in a linear 
stationary stochastic process under the condition that 
observed time series data are available, and examined 
problems due to contraction of information. The spirit of 
the present paper may be useful in nonlinear or nonsta- 
tionary cases. Recently nonlinear phenomena have been 
studied by Saito [72], March-Leuba et al [73], and Bergdal 
et al [74] for reactor diagnosis. Especially, a "limit 
cycle" (nonlinear oscillation) is a important keyword. The 
contraction of information and the stochastic equivalence 
presented in this review paper are also important for 
nonlinear analysis in stochastic diagnosis, if the second 
moment is used. 


Appendix A: Some Matrix Results 
Lemma A.1 (inverse of block matrices): For a matrix 
A B 
S=ic pj, 


A~1+a71gq-1ca-1 -a-1BQ71 


s"i. 
-QCA7 t} ql 
A-1 0) [-a-13 k i 
= E + 5 QACA T).. (A.1) 


with @=D-caA~1Bp under the conditions that A and D are 
square matrices and that A and D-CA~1B be nonsingular. 


[Proof] From the direct multiplication 


-1 =$ 
A B A 0 -A =B 
ss~ls + a-t¢-ca7? 1) 
cC D 0 + I 
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[et] {f]esears 
(ea et 


we have (A.1) 


Lemma A.2: Similarly, consider a matrix 


AB 
S=lc pj, 
where A and D are square matrices, and assume that D and 
A-Bp71c be nonsingular. Then we have 


0 0 I 
saf | wight -BD7}), (A.2) 


with R=A-BD7~!c, also proved by direct multiplication. 


By comparing the upper left blocks of (A.1) and (A.2) 
we have the following matrix inversion lemma. 


Lemma A.3 (matrix inversion lemma): 
(A-BD~ 4c) ~1=a~14a71B(p-ca71B)-1ca-1 (A.3) 


Lemma A.4 (determinant of block matrices): Consider a 
matrix 
Ẹ B 
Selo pj. 


Let A and D be square matrices, then 
det S=det A det(D-CA71B), when A is invertible, 

and (A.4) 
det S=det D det(A-BD~!C), when D is invertible. 


[Proof] Since A is nonsingular and the determinant of a 
unimodular matrix is 1, then we have 


I -A71B A 0 4 
det S=det S =det 1, |pdet A det(D-caq!B). 
e 2 c D-CA`}B 


The second relation can be proved similarly. 
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Corollary A.4.a: 


A 0 
det i =det A det B. (A.5) 


From Lemma A.4 we have Eq. (A.5) and the following corol- 
lary A. 4.b, 


Corollary A.4.b: 


A C 
det =det A det B. (A.6) 
0 B 


Appendix B: Conditional Gaussian Distribution 


Let X have an n-dimensional multivariate Gaussian 
probability distribution with a zero mean vector and a 
covariance matrix Pax and be split as X:=(YT z1)T, where 
Y is an m-dimensional vector and Z is an (n-m)-dimensional 
vector. Then, the covariance matrix is rewritten as 


> £ 
n YF: yz 
se| | 


2 zy Laz 





The conditional probability of Y under the condition that 
Z is given is expressed as 


P(¥1Z):=P(Y,Z)/P(z) =P(X)/P(Z) 
whoa 802), Fe 1171 2expt-(XT3 pp X-ZT2 5 12) /2) 
= (20 )~M/2) q~1)-1/2exp{-(y-¥)TA(Y-¥)/2},  (B.1) 


where Y is the conditional mean vector and A7l is the 
variance matrix: 


‘ = -1 
t- f vpcvizvay-2y,2 22 Z 


votes 5 (B.2) 


~t . ee y 3 _ 
A -fo Y) (Y-Y)TP(YIZ)dY=E 4-2 i 





since the inverse matrix of È yx is given by Eq. (A.2). 
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Appendix C: The Matrix Riccati Equation 


We examine a stable solution of an algebraic Riccati 
equation, 


P=0 {P-PH! r ~lyp}oT+evel. (C.1) 


Since matrix Riccati equations have been studied exten- 
Sively in applied mathematics or control theory [75-80], 
we can take advantage of known results. By putting 
P=PS+GVGT, Eq. (C.1) can be transformed to an algebraic 
Riccati equation with a familiar style in control theory, 


pS=@PS@T-(opSH!+s) (HPSHT+R)~1(HPS@T+S)+Q, (C.2) 
where Q:=@GVGT@T, gs:=@cGvcTHT, and R:=HGVGTHT. For the 


algebraic Riccati equation (C.2) a stable solution PS can 
be derived via a generalized eigenvalue problem, 


MSU=NSUA , (C.3) 
where A=diag(d,, ERA Aq) is a diagonal matrix, of which 
elements are d stable eigenvalues of MS-ANS, 

oTo wt I 00 
MS={-Q I -S and NS=|0 © 0 
st o RJ, o -H OJ. 


Let US be a set of eigenvectors for d stable eigenvalues 
given by Eq. (C.5), and the (2d+q)x d matrix U be decom- 
posed into 


uS=(ug? ug? ugt)T. 


A stable solution of the algebraic Riccati equation is 
given by 


pS=ugu$ 1. (C.4) 


Since M-AN is multiplied by unimodular matrices 
into 


I 0 0 I 0 0 
-DPS I 0f (MS-A NS) ps I 0 
HPS Oo I -KToT -1 (HPSHT+R) IH 1 
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(I-KH)T@T-ar -a2HT(HPSHT+R)~1H HT 
= 0 I-® (I-KH) a -oPSyT-s 
0 0 HPSHT sR J, 


where K:=(P8+GVG!)HT(HPSHT+R) 71, then we have 


det (MS-2 NS) 
=det{(1I-KH)??T-, I1}det{1- (I-KH) a }det (HPSHT+R) .(C.5) 





From Eq. (C.5) it can be concluded that stable eigenval- 
ues of (M-AN) are equal to those of ®(I-KH), since from 
the stability of the Innovation model, Eq. (S215), all 
eigenvalues of ®(I-KH) lie Inside the unit circle in the 
complex plane. Here we have used the matrix result that 
eigenvalues of @®(I-KH) are equal to those of (I-KH)®, 
Since both characteristic equations are the same: 
eig{®(I-KH)}=eig{(I-KH)®}. 


We will give a simple proof of the above statement, 
Eq. (C.4). If @g:=®-SR7'H and Qg:=Q-SR İST, the time 
evolution form of Eq. (C.2) is 


PS(n+1)=@PS(n) oT? 
-(®PS5(n)H?+S) (HPS(n)H! +R) ~!(HPS(n)®T+S8)+Q, (C.6) 


and can be written as 
PS(n+1)=0 g{PS(n)~1+RG1} to f +a. (C.7) 


where Rg:=H'R-1H. If we put Ug(n)=PS(n)U,(n), then Uy and 
Ug are eigenvectors for stable eigenvalues of a matrix V 


-T -T 
U © ®q7R U U 

nery | Sg a a ya). (c.8) 
Vg Qaa Pata aq Ratz 03 


After some matrix calculations, we have the generalized 
eigenvalue problem (C.3) from the above eigenvalue problem 
(C.8). From Eq. (C.5) stable eigenvalues of W are inverse 
of those of ®(I-KH), and are denoted as A, eigenvalues of 
which are stable. W-lwwediag(a-t, A), where W is a 


transformation matrix. 
U U Wi, W ATR o U 
| t |an 10 il 11 = |r| a 
Ug U20 W21 W220 A Ugo): 
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Finally, PS=lim P§(n)=lim Uy(n)U, (n)~+=W5,W,, ~t=ugus-?, 
inally lig Pet) ite U2tayG 4 21W11 =U2U{ 


Furthermore, we can change the generalized eigenvalue 
problem, Eq. (C.3), into a symplectic form, since 
R= -HGVGTHT #0. Eliminating U in Eq. (C.3), we have a 
symplectic form of the Riccati equation (C.2), 


oT-uTr-ts? olf u$ 1 H'R-1n- |f uF 
= N A.  (C.9) 
-Q+SR7ts"1 1 Jj] ug 0 o-sR-tH}L Ug 


Here a=sr-isT, since HG and V are nonsingular. 





From Eq. (C.6) we have a symplectic form 


T o }f us 1 H'r-1H1f{ u$ 
itl ea pea | e 
0 I U3 0 Oq U5 
where A has d stable eigenvalues of (MS-N82)} in alagonal 
elements, since ® qg:=® (I-SR™ ly) = =) (I-G(HG)~ ly), Q=SR™ igT 
Suppose that there is a nonsingular matrix P; Tt lo aT 
=diag(2]» Agrees, Age O ... 0), that is, Ay (i=1 
2, ..., d-q) and 0 of mu ti licity q are d stable eigen- 


values, and that uj;= (uj; uzi) is an eigenvector for an 
eigenvalue À į» then we have 


f ill a f wa | he aaa 
o IJ] a3; 0 A a3; 
(f=1, 


Bp ace OD 


where uf,=T™ Tas, and uS,=TO3,. If the Jordan block for 
2 =0 appears in T™ lot, a detailed treatment is necessary 
as in the paper [80]. For ij#0, we can see from Eq. (C.11) 
that ay has 1 for the i-th component and 0 for the other 
components. For j4=0 of multiplicity q, each Up} is 0 and 





a set of u ‘g is a To ae eigenspace, since 
rank ® y=d~-q. In this way, ve have =0 and uf is a non- 
S_ 


singular matrix. From Uy= =T" TUF and ug-TU§, we can have 
PS=ugu ‘=o. (C,12) 


This means that a particular solution P=GyGT is stable. 
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Appendix D: Transfer Function Models and The Kalman Filter 
Second Type of State Space Model [65] 


In the system described by the following transfer 
function representation of Eqs. (Zet) andi - (2.638), 
y(n)=Gp(z-1)f(n) with Gp(z-1)=H(I-@271)-1, we can have 
another type of state space representation from f(n) to 
y(n), since H(I-@z71)71 =HeH(I-o271)-log-t, Taking a 
coordinate transformation on Eq. (2.1); 





x$(n)=x(n)-f(n), (D.1) 
we have another type of a state space model, 


x$(n)=x(n)-f(n)=®x(n-1)=®xS(n-1)+@f(n-1) 
y (n) =Hx(n)=Hx$(n)+Hf(n). (D.2) 


Eq. (D.2) is the representation of the second type 


x$(n)=®x8(n-1)+u(n-1) 
y(n) =HxS(n)+v(n), (D.3) 


where white noises, u(n):=@f(n) and v(n):=Hf(n), have 
following correlations; 


Gaia (m) v(m))7} dion a. el | 
S uim vim = = 
v(n) HVoT gypT| 2 gt! y| Be: 


(D.4) 





The state space representation (D.3) has a popular form in 
the case of correlated noises in control theory. For the 
state space model (D.3), a Kalman filter [29,81,82] is 
well known as a Markov representation 


x§(n+1{n)=®xS(n{n-1)+L7 (n) 
y(n) =HxS(nin-1)+7 (n), (D.5) 


where a conditional state vector x*(n|m):=E{xS(n)1Y(m)}, 
dxq matrix L:=(®PSHT+S)(HPSHT+R)~!., and PS:=1im PS(n), 


no 
with P§(n):=E{(x§(n)-x5(nin-1)) (x8(n)-x5(nin-1))7}. PS is 
a stable solution of the second type of algebraic Riccati 
equation 


PS=@PS@T-(mPSHT+4S) (HPSHT+R)~1(HPS@T+S517)+Q,  (D.6) 


which is the same type of equation as Eq. (C.2). 
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Let examine a relationship between the second type of 
innovation model, Eq. (2.19) and (2.20) and the Kalman 
filter (D.5), since input and output vectors, y(n) and 
y(n), are the same. Taking the conditional ensemble 
average of the coordinate transformation (D.1) under the 
condition that time series data Y(n-l) are given, we have 





x(n|n-1)=xS(nin-1), (D.7) 
and then, 
x(n)-x(n{n-1)=x$(n)-x$(n[n-1)+f(n). (D.8) 
Due to the randomness and the whiteness of f(n), and from 


Eq. (D.8), we obtain a transformation between solutions of 
the Riccati equations (2.5) and (D.6), 





P(n) :=E{(x(n)-x(nIn-1)) (x(n) -x(n{n-1))7} 
=E{ (x$(n)-x5(n{n-1)) (x8(n)-x$(nin-1))T}4+V 
=PS(n)+V, (D.9) 


and also we have a coordinate transformation between the 
conditional state vectors, 


x(n|n)=E{x(n) |¥(n) } 
=E(x(n) [Y¥(n-1)}+E{x(n) 1/7 (n)} 
=X(n|n-1)+Ky (n). (D.10) 





By using the coordinate transformation (D.10), the first 
type of innovation model (2.3) can be recognized to become 
the second type of innovation model (2.9) or (D.5). From 
Eq. (D.9), the coefficient matrix of Eq. (D.5) is the same 
as that of Eqs. (2.19) and (2.20): 





L=(®PSHT+S) (HPSHT+R)~1=0PH!(HPHT)~1=@K, (D.11) 


GT K=(PS+V)H! (HPSHT +R) l=ppTr 1, (D.12) 
Therefore, it is concluded as in Fig. 7 that the second 
type of innovation model, Eqs. (2.19) and (2.20) is the 
same as the Kalman filter (D.5) in the case of correlated 
noises, and that the second type of Riccati equation (D.6) 
is transformed to the first type Eq. (2.5) by using the 
transformation (D.9). 
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PARTICLE TRANSPORT 
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I. INTRODUCTION 


In the development of stochastic simulation methods applied to radiation 
transport problems sensitivity analysis and perturbation calculus has so far been a 
neglected area. In proportion to the rich literature on the theory of stochastic 
simulations and the countless papers describing its application to a specific class 
of problems, the aspect of Monte Carlo perturbation algorithms has been dealt 
with only sporadically in the context of specific problems. 

Until quite recently most Monte Carlo programs had to be run on expen- 
sive mainframe computers. The advent of inexpensive systems based on mass 
produced CPUs, as for example powerful workstations, Transputers or accelera- 
tion boards, changed this situation drastically. In the light of these facts the role 
of Monte Carlo as a powerful tool of numerics will have to be re-evaluated. 

Despite this advantage the probabilistic nature of the method imposes cer- 
tain limitations if it is applied in a straightforward manner. Parameter studies 
which constitute a great fraction of any complex system analysis prove, for ex- 
ample, impossible if the differences of the sampled responses are of the same 
size or smaller than their uncertainties. Actually, more often than not, the pertur- 
bation of a response as a consequence of small parameter changes, turns out to be 
of greater interest than the response itself. 

This review paper deals with two methods allowing for the calculation of 
small parameter variations. One is based on correlation techniques and the other 
deals with differential operator sampling. In the first part of the paper the author 
describes the theoretical basis of the two methods, which was already published 
in the Journal of Computational Physics, Vol. 111, No. 1, March 1994 pp 33-48 
(Synopsis of Monte Carlo Perturbation Algorithms). For completeness it is re- 
peated here as its equations have to be referenced frequently. 

While correlation techniques provide straight-forward perturbation esti- 
mates, differential operator sampling allows for the determination of gradients 
and higher order derivatives. The latter method offers the possibility to perform a 
posteriori perturbation calculations by the use of a multivariate Taylor expansion 
and to analyze the stochastic simulation process in terms of additional criteria, 
such as: 

- the determination of the sensitivity coefficients, 
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- the estimation of the variance of the target quantities as a function of the 
uncertainty of the input parameters, 

- the eventual adjustment of the parameters such that the discrepancy between 
measurements and calculations is minimized, 

- the optimisation of non-analog procedures. 

The terminology used in the paper is close to the one in (19 and 13). In 
Chapters 3 and 4 correlation, multiple correlation and differential operator samp- 
ling is applied to the simple Monte Carlo integration algorithm. It is proved that 
correlated perturbation estimates have under certain conditions a finite relative 
variance for arbitrarily small parameter changes. 

Chapter 5 deals in its first part with correlation techniques applied to linear 
matrix equations describing a random walk problem. The second part illustrates 
how gradients and higher-order derivatives can be determined for the same type 
of equations. In this part it is also shown that there exists an interesting relation 
between correlation algorithms and differential operator sampling. Finally, the 
functioning of the two methods is demonstrated by comparing the results of ana- 
lytical solutions of slightly different systems of three linear equations with Monte 
Carlo perturbation estimates. 

In Chapter 6 it is shown how the methods can be extended to to Fredholm 
type integral equations. Chapter 7 deals with the application of perturbation and 
sensitivity algorithms to the neutral particle transport equation. A number of 
special cases is treated. They include density changes of whole mixtures of iso- 
topes as well as partial cross-section changes. Also the special cases of perturba- 
tion estimates caused by a black absorber, a strong scatterer, or the voiding of a 
region are treated. Broad considerations are given to the material replacement 
option which allows for perturbation estimates caused by the exchange of one 
material (or cross-section set) by another one. 

The last chapter deals with Monte Carlo perturbation analysis in eigenvalue 
problems. Among the different approaches interest is focussed on the fission 
matrix approach which appears to be particularly suited for performing low-vari- 
ance perturbation estimates of the multiplication factor as well as all other target 
quantities of interest in a self-multiplying system. A list of examples dealing with 
3-D benchmarks concludes this part. All the schemes discussed in Chapters 7 and 
8 were realized in the Monte Carlo code KENEUR (41,42) developed out of 
KENO-IV (37). 
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II. TERMINOLOGY AND DEFINITIONS 


Stochastic simulation procedures, usually called Monte Carlo methods in 
nuclear transport analysis, always aim at the determination of a mean value 
which can be interpreted as being an integral or the solution of a (linear) system 
of equations. The functions to be integrated are probability density functions 
(pdfs) from which discrete variables are sampled randomly to perform the simu- 
lation process. Random variables are a set of real numbers which correspond to a 
probability distribution function. In the case of continuous variables, we define: 


g(ujdu = P[u S w < u+du]; 


g(u) is the pdf of u and gives the probability of finding the random variable u' 
within the interval fu,u+du]. As we shall see later on, it is sometimes more 
convenient to use the cumulative distribution defined: 


Glu) = |" g(x) dx and glu) = dGldu 


Since the pdf g(u) is always non-negative and normalized so that its integral over 
all u is one, it follows that G(u) is a monotonously non-decreasing function 
taking values from zero to one. 

The expectation E of a function f(u) is defined as the average or mean 
value of the function 


Elf) = [7 fl) alu) du = [Ru aG , (2.1) 


where G(u) is the cumulative distribution function of u'. In a Monte Carlo 
integration procedure f(u) defines the probability density function (pdf) from 
which a random variable is sampled. 

The special case in which u’ is uniformly distributed between @ and b 
results in dG(x) =dx/(b-a), leading to the expectation: 


Elf) = ew ea 


The variance V of the function fis defined as the average of the squared 
deviation from its expectation: 


Vi) = EC- BT} = [UF-ENÝAG . (2.2) 


Any product h=f-g (of an arbitrary function f and a pdf g with the proper- 
ties defined above) resulting in the same function h has the same expectation 
(2.1). This does not hold for the variance. The variance depends on f and g and is 
a measure for the convergence of a Monte Carlo integration procedure. 
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HI. CORRELATED SAMPLING 
3.1 Basic Considerations 


The function f of the previous chapter may depend on the variables .x; and 
on a set of parameters p;. Quite frequently it is of interest to provide Monte Carlo 
estimates of the mean of f which show the effect of a parameter variation Ap,. If 
such a parameter change is small, straight-forward Monte Carlo may fail, as will 
be shown, and special techniques must then be employed. 

In the following for the sake of brevity the mathematics is carried out for 
one Ap; only. For this reason the index j is dropped as well as the index 7 in the 
integration variables. The integration space is B=/0, / ]. 

In terms of classical Monte Carlo estimates the two integrals to be com- 
pared are calculated as: 


E(f*) = $ fRx;p+Ap)dG* (3.1a) 
and 
E(f) = f fix: paG ‘ (3.1b) 
where f* = f(x,p+Ap), f= flx;p) and 
dG = g(x,p)dx, (3.2a) 
dG*= @(x;p+Ap)dx. (3.2b) 


The expectation of the difference Af = f*- f is 


E(Af) = E(f*) - Elf), (3.3) 


and the sample variance of Af 


V(Af) = E{((*-f) - Ef -fP}= EUF- EPOUFEQP) 
= [V(ft) + Vf] - 2E{[f* - EPI - EOL, (3.4) 


where 
vir) = [reg rac" and vi) = | 0- EG AG | 


In case the two estimates are obtained by two independent Monte Carlo 
runs, the correlation term 2E/..} of 3.4 vanishes, so that the variance V(Af) is just 
the sum of the two variances V(f*) and V(f) 
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VAS) = [VEV] . (3.5) 


For small changes Ap the estimate is usually expressed by the relative 
change (differential quotient) Af/Ap. As the sum in (3.5) is positive and finite, it 
is now easy to show that for small parameter changes Ap the relative variance 
can increase without limit: 


dim, ViAfl4p) = Vf *) + Vif] /4p? > © - (3.6) 


If on the other hand it is possible to correlate the calculation of E(f*) and 
E(f) then the relative variance may assume a finite value: 


VE) + WOP — UEF : f) - EE] 
Ap? 
Proof: One possibility of correlating the Monte Carlo integration of 


fix;p+Ap)dG* with f(x;p)dG (Equations 3.1) consists in expressing dG* in terms 
of dG by the use of Equation 3.2. It leads to 


lim, Vidfldp) = Sa éan CN 


ihn : (x;p)dx _ g(x;p+Ap) 3.8 
dG* = g(x;p+Ap) on cp) a en dG (3.8) 


a so called "likelihood ratio". Formally this procedure is similar to importance 
sampling whose restrictions are valid in this context too. With 3.8 inserted in 3.1 
and then together with 3.2 in 3.3 we obtain: 


a(x; p + Ap) a(x; p) 


zoey) PP) wa; =i Me R 


E(Af) = hræ» + Ap) 


Note that the two functions f(x;p+Ap) and f(x;p) are now under the same integral 
and random variables are taken only from dG. This changes the Monte Carlo 
integration procedure to 


N 
1 8(%ni p + Ap) 
ar DJ) + 4p) — ——— 
N; E2 B(%nP) (3.10) 
and 
jn LS 3.10b 
= HD fie (3.106) 


where the x,'s are sampled from g(x;p) and inserted simultaneously in both 
equations 3.10a and 10b. The total number of samples is N in each case, while f* 
and f are sample means approaching E(f*) and E(f) for large N. 

To estimate the variance of the differential quotient 4f/Ap for small values 
of Ap, we develop in (3.9) the product h(x;p+Ap) =f(x;p+Ap)g(x;p+Ap) in a 
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first-order Taylor series in p 
A(x;p + Ap) = h(x;p) + ApS. xp) (3.11a) 


Ot x;p) 


= fixip)a(xjp) + 4p| LEP? exp) + f(x;p) 


kc) . (8.119) 


Using Equation (3.11a) the integrand of Equation (3.9) becomes 
9 +A g(xjp+Ap) _ D) me Abate E Oog 1 A sil 
f(xip + Ap) aap) fixp) ~ Ap flxip)| OTT per) t Z fixip)| 


Inserting Equation (3.11b) into equation (3.9) we obtain (after dividing Af by Ap 
and reorganizing the formula) 


EAP) = [|È nao) + È neco) Awas . 
The variance of 4f/Ap can now be expressed as 
a eae an E 2 
yearn) = [EAD + fern) 2 nee] ac - E. 813 


This expression proves that a correlated sampling procedure has indeed a finite 
relative variance for Ap — 0, as long as f and g and their derivatives remain 
finite within the integration interval. Experience has shown that this condition is 
satisfied in many practical cases. In situations where a finite integral is not 
guaranteed, it is possible to transform the problem as discussed later. 

If only one of the functions f(..) or g(..) depends upon p then equation 3.9 
becomes either 


" (x; p+ 
Bap = [EER - iawa , (83.14) 
or 
= [Ax p+Ap) r 
Bian = [ESSR - 1| espa . (83.15) 


For both special cases a finite relative variance depends again on the condition 
that the derivatives of for g be finite in the whole integration space. 

In this perturbation scheme it is an essential requirement that the likelihood 
ratio g(x;pt+Ap)/29(x;p) in equation 3.9 remains finite in the whole integration 
space. Possibly it should be close to unity to keep the variance of the differential 
effect small. 

In cases where these conditions cannot be satisfied the author proposes a 
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technique called Multiple Correlation, It consists in the construction of a 
reference function go(x;p) from which the random variables are sampled. The 


reference function gg must be finite in the whole integration domain of both 
g(x;p) and g(x;p+Ap} so that 


BGP) < oo and SP+4P) < o% 
8o(x; p) 80(x; p) i 


We define a new expectation 


E(fo) = | flxip)dGy 


where ọ (x;p) = f(x;p) and dGg = go(x;p)dx. It is subtracted from and added to 
equation (3.3) 


[E(f*)-E(p)]+[E(e) - E(f)] = E(Afa )+ E(Afg) . 
The uncorrelated estimate of Af, is 
E(Afa) = Elf") - Elp) = |, flxip+ Ap)aG*— | fix: p)dGo . 


Similar to (3.8) it follows that 


dG’ = &Riv+4P) ag, 
R0(x; Pp) 


and in analogy with (3.9) the correlated estimate becomes 


= s xpt 5 
B(Afe) = [faep + Ap) EEEE — fx:p)|dG 


In the same manner Afg can be evaluated: 
E(Afp) = Elp) - E(f) = |, flxip)dGo- |, foxp)dG. 
Replacing dG by dGo multiplied by the likelihood ratio g/g 


dG = BP) 4G, , 





R(X; D) 
the estimate of the correlated Afg becomes: 
a(x;p) l 
E = 3p) — flr; dGo . 
(Afa) Jy P) = BEY Teg | Oe 
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Adding up the two estimates of Af, and Afg results in: 


g(x; Pp + Ap) gx: p) 


a Saa a na s 3.16 
80(x; p) Kaip) LANO (2.46 


F(A) = fæ» + Ap) 
The function gy (x,p) can be chosen freely, but it will depend to a large ex- 
tend upon the problem to be solved. gg determines the variance of the procedure 
and may therefor be subject of optimisation. The general conditions for gg are: 
- 8qis a probability density function satisfying the condition g9(x,;p)>6 for all x, 
- Gof{x;p), the integral of go(x;p) is known analytically and increases monotoni- 
cally from 0 to 1, 
- either function Gọ(x;p) can be inverted analytically or a gg-distributed random 
variable is available, 
- both ratios g(x;p+Ap)/go(x;p) and g(x,p)/go{x;p) should be as constant as 
possible in the whole integration space. 
Typical, but not necessarily optimized, examples for choosing gg are the 
arithmetic mean 


8o(X; p) = [8(xip + Ap) + g(x; p)] /2, (3.17) 


or the maximum of either function g* and g: 


et a) 2) (3.18) 
J p dx max[g(x;p + Ap), g(x;p)] 

This latter reference function looks particularly attractive, since it leads to 

likelihood ratios close to unity. It can easily be generated if piece-wise constant 

functions such as probability tables are used. A specific example of this function 

is described in Section 5.4. 


IV. DIFFERENTIAL OPERATOR SAMPLING 


4.1. Basic Considerations 


It is the scope of differential Monte Carlo to calculate linear and, if requi- 
red by the problem, higher-order derivatives. These derivatives of the target 
quantities refer, as in the case of correlated tracking, to the parameter vector p. 

The derivative of the expectation will be formulated for the general case, in 
which the mean-value of f(x,;p) is calculated for a variate with the pdf g(x;p): 


r= Ef) = [pes pets) dx = [ifp aG . (4.1) 
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For the derivative with respect to p; it follows: 


E AE) = itema + frond). 


This expression will now be interpreted in terms of a (stochastic) sampling 
procedure. In a first attempt one would probably intend to integrate the two terms 
of the right hand side independently. But in a simulation analysis of a compli- 
cated multidimensional problem (which Monte Carlo is actually made for) it 
would lead to a substantial increase of the programming effort and computer 
time. In most cases it would, for these reasons, clearly not be practical. 

An easy transformation makes it, however, possible to express the estimate 
of the derivative in a form which facilitates the sampling procedure. To this end 
we replace dG/dp; by 





dG 1 
ay) g(x; p) a S (x; p) dx = 7E PET = B(x; p) B(x; p) dx 


wer 
a Inl g(x; p)) dG , 
and introduce this expression into (4.2) 


or < o : a l l 
Opi Jase Inf fix; p)] + ap, In{g(x, pi) fx p)dG . (4.3) 


The advantage of this transformation is the fact that the term /(x;p) appears 
explicitly in the formula. This allows for the simultaneous calculation of the 
integral (4.1) and its derivative (4.3). In the scoring procedure it is only neces- 
sary to multiply the function value of f(x;p) by the function value of the 
derivatives in the parenthesis. Furthermore, it follows that for sampling the 
derivatives the same variates are used as for the calculations of the original 
response r. A Monte Carlo estimate of d?/dp; is then calculated as 


=> ire Inf, (x; p) + Snax; pi} f(x; P) , 


where f, stands for the value of the function f and dlnff,(x;p)J/dp; for the value 
of the derivative in the v-th trial. 
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V. PERTURBATION ANALYSIS OF LINEAR MATRIX EQUATIONS 
5.1 Solutions of Linear Matrix Equations by Random Walk Techniques. 


A set of linear equations 
x=Px+a (5.1) 


of order m can be interpreted as a random walk process if the matrix P describes 
transition probabilities in a system which can assume a set of states from / to m. 
If the probability for transition from state i to the next state j does not depend on 
the past, i.e. on the path by which j was reached, the system is said to be 
Markovian. A typical Markov chain will start in some state lọ with subsequent 
transitions through a sequence of states I), Ip, ..... and finally a termination in 
state J, yielding a series of integers yy = (lp, lj, Ip,....1,) where 1,€ {1, 2, .... m}. In 
Monte Carlo terminology such a chain is often called a history. 
After a start with probability 
P ( lo =i ) = aj; 


the successive states are connected by the transition probabilities 
Pll = i [y= j n < k) = Pij. 


and the termination probability 


P(k=n/l,= i) = p;. 


The p; j being elements of the matrix P (dimension mxm) are the transition 
probabilities from state j to i, for which the following conditions must be 
satisfied: 


py 2 0, max{Spy) = l Ya; = 1 and Dj = 1->py = 1. (5.2) 
i i i 

The last condition is the terminating criterion for a chain of events. There must 

be at least one p; for which p;</J is met. Furthermore, only matrices P with 

spectral radius p(P) < 1 are allowed. (The spectral radius of the matrix P is the 

largest modulus of all eigenvalues of P. Thus it is the radius of the smallest circle 

in the complex plane which contains all eigenvalues of P.) 

For each state i we define a variable x; which is the average number of 
entries into state i. The state i can be reached either by an event starting in i with 
probability ajor by a transition to i from state j with probability p; j. The mean 
number of transitions from j to i corresponds to the product p; — j% 

It is worth noting: the solution of this system of linear equations by the 
random walk algorithm described here requires that the pj are interpreted as 
transfer probabilities from j to i leading to the "backward" notation p; e- j 
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All events in state i are the sum 


Xj = Pie 1X tPie2X2t... +PiemXmt G. 


The expected number of transitions from all states to all other states are then 
taken into account by the matrix equation (5.1). 

The solution of this system of m linear equations by a random walk 
procedure requires the following additional considerations. If P") denotes the 
probability that the n-th transition (in the sequence) occurs to state i 


P{) = P(l,=i/n-l <k), 
a recursion expression for P{”) can be formulated 


PP = > pP?  forn21 and P(O = P(ly = i) = a;. 
j 


It stands for the probability that transition n enters state i provided P; (n-I)is the 
probability that the (n-/)-th transition occurs inj summed over all intermediate 
states j. With this recursion the number of entries into state i denoted by x; can 
now be interpreted as being the expectation of (x;), where 


m m m 
a; + > Pua, + > > PipPigh % +... (5.3) 
l =] 


h=1h=1 


x) 


G+ > >... > Pin + «Pay (5.4) 
kel h 


l 


= a; + (Pa); + (P?a);+..... (corresponds to the 
Neumann Series !) 


= a; + Pla; + (Pa); + (P?a);..... ] 
=a;+ P(x). (5.5) 
Written in matrix notation 
(x)= P(x) +a (5.6) 
is equivalent to equation (5.1). 


The sample mean &; of all chains of events (histories) yy = (lọ, ly, lz, e-sely) 
is the normalized sum of all entries into state i: 
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Hkh) 


2, = 7 2 abs (5.7) 


h=] n=0 


with Kronecker's symbol 6, = 0 for i # j and 6, = J for i = j. H is the total 
number of histories and k(h) < œ the length of the transition chain y in the h-th 
history. 

The estimates of X; converge to the exact solution of the matrix equation as 
the number of histories increases (see Fig. 5.1). 


5.2 Perturbation Algorithms solving Linear Matrix Equations by Random 
Walk Techniques 


It has been shown (13 p. 85) that a matrix equation 
x=Qx+b, (5.8) 


satisfying less stringent conditions than those required by (5.2) may be solved by 
"mapping" it onto a random walk as described above. A correlation technique 
which makes this possible, is also the base for the perturbation analysis of Linear 
Matrix Equations discussed in this chapter. Using the same notation as previous- 
ly, we define two likelihood ratios 

#0 for py #0 

&; = bila; and ny = qij /py requiring 
= 0 forpj=0, 


and 


[]Q||= max [Se < 1. 


To solve equation (5.8) we again follow random walks y, = (lọ, 1}, lz, . «lp» 
. , 4) connected by the transition probabilities py. But in this case the random 
walk is accompanied by a weight factor w, being the product of all T; associated 
with the states it passed through before entering state n. 


Wr = Wr) = Thl © + + Thh hh A, (Wo=a,, ) - 
Instead of just counting the transition events as in (5.7) the score of x; up to the k- 


th transition is the sum of those weight factors which refer to transitions into 
state i. It leads to the sample mean 
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Input - number of equations: m 
no. of equations to be solved simultanuously: k,,,,, 
elements of the kpa; Matrices Q: C APAE 
elements of the kay Vectors b,: b, ; 
number of random walks: Apax 


Construct the Reference Case: P; = (x ij) Imax? a; = (x 
Determine the Likelihood ratios: pij pi; P, 


kij 





Figure 5.1: Flow-chart for the simultaneous solution of systems of Linear 
Matrix Equations by the Multiple Correlation algorithm. In constructing the 
reference case it is assumed that all q; ; ; are positive. 
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H k(h) 
re I Ò. 
i-F Wr Pil, » 
h=1 n=0 


To calculate the second moment, the score of x; must be summed up and 
squared for each history separately: 


a 1 H 1 k(h) 2 
7 = ape òu) ; (5.9) 


rendering the sample variance 


A 


VG) = xP -4 , (5.10) 


To facilitate the reading of the formulae we adopted the convention of 
indicating pdfs from which samples are taken by Roman letters (italic) and 
likelihood ratios (determining weight factors) by Greek letters. 

The frequency with which these weight factors are chosen corresponds to 
the random walk distribution governed by the matrix P and the start distribution 
a. In analogy to expression 5.3 the expectation of x; for all chains of events 
starting with the distribution a), @3,.. A and reaching the final state k is: 


Qj) = aa; + Dawe Pinan + > mee PibPhy%, + 
h=lh=l 


=a;a; + > va er ~My Oy Pi, - -Puh . (5.11) 
k=l hk h 
In perturbation analysis the interest is focused on the difference of two (or 
more) estimates %;* and £; being solutions of the two matrix equations 


a= Q*3™ 4+ b* and è= OF+b, (5.12) 


which are of the same dimension and may differ only slightly from each other. 

A straightforward approach would be to perform two independent 
calculations with Q*, b* and Q, b where Q* = Q + AQ and b* = b + Ab. In the 
case of two independent and uncorrelated computer runs the variance for the 
difference becomes 


V(Ax) = V(x*) + V(x). 


As V(x*) and V(x) are always positive quantities, the relative variance 
V(Ax)/ | AQ| is unbounded as | AQ | ~— 0 similar to the considerations of 3.1. 
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Applying a correlation technique it is possible to avoid this shortcoming 
and to obtain a finite relative variance even for very small changes of the matrix 
Q and/or the start distribution b. Correlation is achieved by mapping both 
systems Q*, b* and Q, b onto the same random walk process which we call the 
reference system. It is governed by a transition matrix P and a vector a. 

The reference system can be chosen arbitrarily, provided it satisfies the 
condition (5.2). If all qij >0 and gy 0 it can be constructed as the arithmetic 
mean of two (or more) systems Q*, b* and Q, b. 


Py = (ay + Qy)/2 , (5.13) 
or 
max (lafl.laisl) 1 
Py = = D5 Maile layl) 5.14 
"max (afl, \ai) 2 2 (ail + lal iia 


The reference system can also be identical with one of the systems Q*, b* or 


Q, b. 


The likelihood ratios determining the solution of equations (5.11) are 
Tiy* = gy*/py and o;* = bj*/a; 
y= qij / Pi and Qa; = b;/a;. 


As in 3.2 these ratios should be as close to unity as possible. With this assump- 
tion both systems follow the same random walk, one with the product of weights 


n 
* * * EEE * * Ban * 
we E way My = hy Msg «++ Tah = [ [ai (5.150) 
i=l 


and the other with 


n 
Wa = Wh 1g = Wy yeaa Tahy = | [i (5.150) 
ia 


where 7%), = Q, and my = a), . 


To simplify the further considerations we assume that in Q* only the 
element 


aiy“ =n t Agy ’ 


is different from Q and that Q, b serves as the reference system. With these 
assumptions all matrix elements of P are py = qy and all likelihood ratios Tj, n*i 
and of are 7 except: 
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+ 
nt, = a e ua- 1+ Apn , (5.16) 
in 


where AP.» is the relative change of the element qy» 
The score of the difference Ay; in a single history (chain of events y4) is: 


k 
AX; (y k) = >| THe + Apin 5, Ôn, 1) Mh 7 Tau. ily» 


ILy=1 


k 


F >| Io + Apin 541, 5y1,..) — 1 Fo . (5.17) 
v=] vel 


If Ap,,< J the following approximation is possible: 


Tle + APin 5u,Ony 5) ~ 1+ AP Òu, Ôn- (5.18) 


Introduced into equation (5.17) one has: 


k n n 
Ati (yy) = Ap >. (3 2ee fs : (5.19) 
n=1(\v=l v=l 


After moving Ap,, to the left hand side and summing over all histories H (where 
k(h) is the total number of transitions in the h-th history) the sample mean 
(Monte Carlo estimate) of Ax;/Ap,, becomes 


Hkh) 
se” HE Set] japa ecm 


=1n=] 


and in analogy to (5.9) its second moment is 


Ag 1 Ë 
s= a 


With the assumption that my =J(for ij # m) and all œ=] the variance of the 
relative perturbation is: 


V(AR; ) ] H (khp n 2 ] H kh)j n 2 
im, a DA on. | -4È S Soutul . (5.22) 


h=lin=1 


h) 


i 2 
aad (È > 51,9 mfu] , : (5.21) 
v=] 








As this expression is finite for all terminating random walk procedures, it 
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Input - number of equations: m 
elements of the Matrix Q: q. 
elements of the Vector b: b, 
number of random walks: Anay 


Construct the Reference Case: Pi = q! (x l); aj 
Determine the Likelihood ratios: 1; j= aj P; $ 


i 
Calculate cumulative probabilities: &, = >F r PES B7 
n=] 


G,= lē, l, =° for k „i,j all from J to m 





Figure 5.2: Flow-chart for calculating a system of linear equations including the 
first order derivatives (sensitivities) with respect to the elements q;; of the 
matrix Q. (note that G; = G,_;) 
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proves that the multiple correlation technique proposed here, renders a bounded 
relative variance for arbitrarily small perturbations. 


5.3 Differential Operator Sampling: Gradients (Sensitivities) and Higher 
Order Derivatives 


Equation 5.20 yields another important result. The fact that 


Ox, (=) pý 
= lim |——}, wherei,j,ke {1,....m} 
P; 


4p 0 \Apy 


allows to calculate the elements of a Jacobian of the %; with respect to all 
transition probabilities gj-s by the unbiased random walk algorithm described 
above. A different and more general method of calculating simultaneously the 
X;-s and the derivatives of the %;-s is, however, shown in this paragraph. To this 
end we express the expectation of the x;-s of equation 5.8 by their Neumann 
Series which describes the random walk procedure rendering the solution of the 
problem: 


eo m m 
+) 2 Site nbn - (5.23a) 
k=Ih=1 (=l 


To simplify the further considerations this equation is rewritten in the following 
form 
k+1 


(x) = b; + >: a | Tau. m (5.23b) 


li n= 
where q= b and 4), = Qt, . 


Now we assume that yz transitions 4-7 were encountered in the chain yg = 
(li, . . . lp), where we {0, 1, 2, 3,..}. The procedure of differentiating (x; with 
respect to the element qıy becomes more transparent if we take the q,»-s out of 
the product term so that only thus q-s remain for which /,€-l,, 7 # =n. This is 
formally expressed by the modified upper product boundary k*+1= k - uy+l 


k+l 


&) = + 2; s DA BETS 
k= 4 n=1 


Differentiation with respect to the element q, leads to 


k+l 
($4) - hie Leeda lann. (5.24) 


=I} T 
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Putting q#,, back into the product IT this ear can be rewritten as 


($4) - DD ae Tew. 1 (5.25) 


Again, as in 5.2, we generalize the Markovian chain process by mapping it onto 
a random walk system governed by the matrix Ilp; ll. Finally, with dg/q = dp one 
obtains: 


($4) = 2D Ifran Movs. (5.26) 


k=l 


where Pro = ap Mnl = Ors Pia = Pi, and Mipi = Ti- 


This equation can be interpreted in terms of the same random walk process 
by which the expectation (x;} is obtained. In the chain y, a counter 4, must be 
increased by / each time a transition from y to + occurs. Simultaneously the value 
of the counter is multiplied with the weight w,,,; (see 5.15) and added to the 
score from which the sample mean of the derivative of x; with respect to qis 
calculated: 


A 
Ox; _ 1 ae 3,8 
Pun = ad, 2 (È il, Onl, JE lar | Sit . (5.27) 
It says: - if the first transition occurs from the source state lọ= 4 into /)= + or 
from a later transition state /,_; = 9, 1,=« and if at the same time i=« then Iwy is 
added to the score of 0%/0q,,; - if later on in the chain of events Ypa second 
transition from /,_)= to l= + occurs and if at the same time i=: then 2w, is 
added to the score of d%; 104.y and so on. The expression is identical to (5.20). 
Figure 5.2 shows a flow-chart for a Monte Carlo procedure by which the 
derivatives of all x,-s with respect to all Ty (Or py) can be estimated. The results 
are listed in m two- dimensional matrices Gy = Ilgglly 5 where i,j, k = {1,2, .. , m} 


By 
Fw. (5.28) 


To obtain the second-order derivative expression (5.24) is again different- 
iated with respect to q,, leading to the expectation 


ott 
(i) PE. Zaa- Dag ICO 


Replacing ðą?/ą? by dp? and following the procedure described above we obtain 
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ax, k+l k+l 
(z) PH Did - mwa] [nni ] Puni. (5.29) 


k=l h l; 





and the corresponding sample mean 


H kA (Ty an 2 n n 
= HÈ ba P = Sut. fs. Oi, . (5.30) 


In a similar fashion the cross-terms can be determined if derivatives for two (or 
more) terms q,, and qg; are required: 


oo ) = oe Derd Jau Tote (5.31) 


=l k l; 


S| 


LY 


2 
Pin 


In this case the sample mean is 


A EEA om 


h=1 


5.4 Examples 


To demonstrate the functioning of multiple correlation and differential 
operator sampling described in 5.2 and 5.3, the author wrote two Fortran 77 
programs following the schemes developed in Figure 5.1 and 5.2. The programs 
use dynamic storage allocation so that the number of equations is only limited by 
the available memory space. Experience has shown that scores must be collected 
in double precision (64 bit) arrays. In the case of differential operator sampling 
the execution time can be substantially shortened if the score of the Jacobian is 
calculated by a vector processor. 

Example 1 consists of a system of 3 linear equations. The random walk 
solution can easily be verified by a deterministic method such as Cramer's Rule. 


x12 | =| 30.30.10 30|- 


X11 (3 50 i X11 (20) 
Xj3] \20.20.20 40 


X13 


In the following perturbation analysis carried out either by multiple corre- 
lation (a) or a first-order Taylor approximation (b) two small parameter changes 
in qyare considered. In one case qj, is increased by 7% leading to the system 
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X2) 2525 50 .08\ [%2,1 30 
X22)= 30 30 10 X22 +1.30 F 
x3] \20 20.20/\y,,] \40 


and in the other case 433 is decreased by 7% leading to 


X37 25 50 .08\ | *3,1 30 
X32 = | .30 .30 10 X32 +{ 30}. 
.20 ,198 .20 40 


X33 X33 


a) Multiple Correlation: The samples of 50000 random walks provided for 
the three equations (k = 1 to 3) the results shown in Table 5.1. 

b) Differential Operator Sampling: In this example simultaneously with 
the unknowns x;, X2, X; also their derivatives (Jacobians) with respect to the 
parameters qj are calculated. The perturbation of 7% is so small that a linear 
Taylor expansion using the sensitivity coefficients dx, ldqy provide good pertur- 
bation estimates. Applying the scheme of Figure 5.2 the results listed in Table 
5.2 were obtained. 


HO. OF RAHD.WALKS: © 56668 


aa LISTING OF OUTPUT 999 EXEC, TIME= 7.31800E+01 [s] 


THE UKKHOHNS X(K) ARE : 
X(t)= 1.272 X(2)= 1.497 X(3)= 1.106 


NO. OF RAHD.WALKS: © 56086 
EXEC. TIME= 8.39600E+01 [s] 


JACOBIAN OF X(1) WITH RESPECT T0: 
RHO(T,1)  RHOCI,2) — RHO¢T, 3) 
I= í 66798 1.494 19848 
I= 2 61088 54006 19642 
I= 3 11176 „09080 09758 
JACOBIAN OF X(2) WITH RESPECT T0: 
RHOCT,1)  RHO(CI,2) RHO¢T, 4) 
I= i 32682 59052 09238 
I= 2 96868 77624 25632 
I= 3 10516 09966 69298 
JACOBIAK OF X(3) WITH RESPECT T0: 
RHO(T,1) © RHOCT,2)  RHO(I,3) 
I= 1 124232 43694 07046 
I= 2 36398 39066 10954 
I= 3 37564 32856 31998 


THE UHKHOWHS X(K,T) ARE : 

X(K,1) X(K,2) K(K, 4) 
Ki 4.2745 1.1360 1.1062 
S.D. £0070 +.0074 $6642 
K= 2 1.2782 1.1392 1.1686 
K= 39 1.2786 41,1351 1.1029 


THE DIFFERENCES DX(K,)=X(K,)-X(1,1) ARE: 
DK(K,1)  DX(K,2) KCK, 3) 
K= 2 46678 46327 00243 
S.D. +.00011 £00087 +.60004 
k= 3 = -, 08891 = -,88889 = - 88328 
S.D. +.00003  +.0aaa3 = +. 00004 





Table 5.1; Perturbation analysis of Table 5.2; Differential Operator 
three linear equations carried out by Sampling for a system of 3 linear 
Multiple Correlation. equations. 
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AX; (0x; /0p 11) Ap AX; Ax" 
mult. corr. analytical 2 indep. M.C. runs 


6.70E-3 6.68E-3 6.66E-3 0 .9E-3 
+0.11E-3 +9,9E-3 
3.27E-3 3.27E-3 3.20E-3 -2.1E-3 
+0.07E-3 + 10.4E-3 
2.43E-3 2.42E-3 2.46E-3 0.6E-3 
+0.04E-3 +5.9E-3 


Ax3;  (0x;/0p32)(-Ap) Ax" 3) 


—9.08E-4 — 9.25E-4 2.3E-3 
+9.9E-3 

— 8.87E-4 — 8.64E-4 5.6E-3 
+ 10.5E-3 

— 3.29E-3 — 3.28E-3 -9.1E-3 
+5.9E-3 





Table 5.3: Comparison of Perturbation Estimates 


Discussion of results: In the second and third column of Table 5.3 the 
results from multiple correlation (Ax; = Xy; - Xj; ) and a linear Taylor approxima- 
tion (Ap = 0.01) based on differential operator sampling are listed. Comparing 
these values with each other and a deterministic solution listed in column four 
shows excellent agreement of the perturbation effect. On the other hand it can be 
seen that the results from two independent Monte Carlo runs listed in the last 
column have a standard deviation which is much larger than the differential 
effect and up to two orders of magnitude greater than that of the correlated 
calculation. 


Example 2 deals with multiple correlation estimates using reference 
equations constructed in two different manners. The first is based on the 
arithmetic mean as in expression 3.77 and the other on the slightly modified 
procedure 5./3. (Note the difference in the likelihood ratios in Table 5.4 and 
Table 5.5.) In both cases we solve the first equation of the previous example 
assuming that in the perturbed case q( 1,3) changes from 0.08 to 0.0. 

With the reference equation based on the arithmetic mean 50000 random 
walks render the results listed in Table 5.4. The same problem calculated with a 
reference equation based on the procedure 5.14 leads to the results of Table 5.5. 

It is interesting to note, that the second type of reference equation improves 
the efficiency of calculating the perturbation effects [DX(K,I)] almost by a factor 
of two (e.g.: 2.282 /1.742 = 1.7), For completeness the same perturbation effects 
are also calculated by a linear Taylor expansion using Ap = -J and the corre- 
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sponding elements of the Jacobian calculated in example 1: (Ax; 1P; 1 3 ) Ao = 
—1.885E-1, (0x5 1092 7.3) Ap =- 0.923E-1, (dx3 19P3,1 3) Ap =- 0.704E-1. 


CUMULATIVE PROBABILITIES OF P(T,J): 
P(T,4) P¢T,2) PCT, 3) 


HOOD RATIOS PI 


I= 1 
I= 2 
I= 3 
LIKELTHO 
PI(1, 


I 
I 


I= 1 
I= 2 
I= 3 
LIKE “pe PI 


PI(2,1,4) PI¢2,1, 


1 
( 
I 
i 
i. 
L. 
( 
I 
i 
i 
i 


I, 

i. 

i. 

i. 

LTHOOD 

I 

I i. 
I i, 
I i 


HO. OF RAHD.WALKS: 50000 
EXEC, TIME= 4.47600E+01 [s] 


THE UNKNOWNS X(K,I) ARE : 
(Ki)  X(K,2) 
K= i 4.2699 1.1364 
S.D.  +.0078 +.0079 
K=2 4.0972 1.0550 


K(K,3) 
1.1047 
+, 6045 
1.8451 


DIFFERENCES DX(K,1)=X(K,1)-X(1, 1): 
DX(K,4) DXCK,2) DX(K,3) 

k= 2 -1727 -.08i40 -.06356 

S.D. £0048 4.00358 +.66228 





Table 5.4: Perturbation analysis 
of 3 linear equations carried out 
by Multiple Correlation and a 
reference case based on the arith- 
metic mean. 


CUMULATIVE PROBABILITIES OF P(T, 3): 
P(T,4) P(T,2) PCT, 3) 
5 04 
4 (d4 
1.6 4 
RATIOS PI(1,1,J): 

) PICL,1,2) PIC,1,3) 
1.1176 
1.1176 
1.1176 

J): 

PI(2,1,9) 
6.8 
1,1176 
1,1176 


il 

6 1.6 

4 1.6 

6 1.6 

ELTHOOD RATIOS PI(2,1, 
PI(2,1,4) PI(2,1,2) 
1.6 

1.6 

1.6 


HO. OF RAND.WALKS: 50000 
EXEC, TIME= 4.46700E+01 [s] 


THE UNKNOWAS X(K,1) ARE : 
X(K,1)  X(K,2) 
K= 4 4.2720 1.1369 
S.D. 4.0073 +0076 
K=2 14.0955 1.0499 


X(K, 4) 
1.1168 
+,6045 
1.0450 


DIFFERENCES DX(K,1)=K(K,1)-X(1,1): 
DX(K,1) DX(K,2) DX(K,3) 
-1764 -.08714 -.66582 
+.00275 +.00174 


K= 2 
S.D. £0036 


Table 5.5: Perturbation analysis of 
3 linear equations carried out by 
Multiple Correlation and a refer- 
ence case based on the procedure 
DIZ: 


These approximations show a small, but systematic deviation from the "exact" 
solution obtained by the correlation technique. 

The examples shown above were calculated on an Archimedes 440 
(ACORN Computers Ltd., UK) home computer with an ARM2 processor and a 
FP Co-processor. 
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IV. FREDHOLM TYPE INTEGRAL EQUATIONS AND PERTURBATION 
ANALYSIS 


6.1 General Considerations 


The previous section dealt with discrete transition probabilities p; — j and 
source distributions aj. If the matrix P is replaced by continuous functions of the 
form K(x «— y) and the source vector a by a distribution function a(x) then 
equation 5.2 becomes a Fredholm type integral equation of the second kind: 


fix) = [ax (x — y)f(y) + a(x) » (6.1) 


where K(x y) describes the transition properties in the (multidimensional) inte- 
gration space R. 

This integral equation can be interpreted as an infinite matrix equation for 
which the random walk procedure described in Chapter 5, approximates the ex- 
act solution as the number of trials increases, provided the criteria of convergence 
are satisfied. 

The random walk procedure can be interpreted in terms of the following 
Neumann series being a solution of this integral equation: 


fix) = Pfa) = D fy- [RGR thy) K thy — in-i) os 
n=0 n=0 


eee . K(uy — u,)K(u, & up) dü, esse dup ’ (6.2) 


or in shorter notation 


co n oo n 
f(x) = cae > fel [du Kes 1 € u)= x le fl [du;K; , 
n=0 i=0 n=0 i=0 
where u,,; =x and 
folx) = [dual — up) . 


is the sum of all direct contributions (transitions) from the source, 
fi(x) = [auf duo K(x — u;)a(uy & ug) 


the sum of all contributions undergoing an intermediate transition, and so on. 

Since there is no essential difference between solving equations 6.1, 6.2 
and 5.2 by random walk techniques, the perturbation algorithms developed in 
Chapter 5 can readily be applied to Fredholm type integral equations. 
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6.2 Correlated Sampling Procedures 


We assume that the kernel K; depends on a set of parameters represented by 
the elements of the vector ø which undergoes a possibly small change 4p, The 
difference 


Afix) = f¥(x) - fx) (6.3) 
where 
f(x) = D fe h | Lem K (ois wsp) 
n=0 i=0 
and 


f'(x) = Si fa E | Jdu; Ku, € u;; p+ dp) 
n=0 i=0 


can be calculated by a correlation technique as described in the previous chapter. 

To this end again a reference case characterized by the kernel C;= C(u;,.), 
U;; Po) is correlated with the kernels Kj" = K(u;,),4;; p+4p) and K; = K(u;,),l;; 
p). As outlined above, the equation with the reference kernel C; can either be 
identical to the unperturbed or the perturbed case or it can be constructed such 
that the variance of the differential is minimized. This choice depends on the 
problem to be treated. Usually it will be in favour of a reference kernel 
constructed according to the examples 3.11 or 3.12. Defining the likelihood 
ratios w; = K;/C;and w;* = K,*/C; we obtain 


fe) = > i ae i {IT { IQ du) (6.4a) 


fz) = Zh 7 IIG du), (6.4b) 


i=0 


The expression J]C; du; describes the random walk procedure of the reference 
system, while Iw; and Iw; refers to the product of the likelihood ratios 
correlating the unperturbed and the perturbed equation with the reference 
equation. 

In the most general case the w; and the w;* must be determined for each 
transition separately. Usually they are simple analytical expressions, which can 
be computed with little effort. 
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6.3 Differential Operator Sampling 


Another, quite different way of estimating perturbation effects, is the 
differential Monte Carlo method. It is based on a general or multivariate Taylor- 
series expansion of f(x): 


fixo + Ap) = flxip) + TEP) Ap, + y SEP aoe. . 65) 
k 


This method implies the estimation of parametric derivatives. The estimators for 
the various derivatives of f(x) can be derived directly from the Neumann series. 
Equation (6.2) differentiated once with respect to one element p% ofp leads to: 


= YI, du, fete ara SIG ), (6.6) 


where the brackets (...) stand for the regular Monte Carlo game and {...} for the 
estimator of the scores rendering the first order derivatives of f(x). 

Expression (6.6) is very similar to (6.4). Only the multiplicative term []w; 
generating the weights of the perturbed history has been replaced by an additive 
one containing a sum of first-order derivatives of the transition kernels. The two 
terms are however closely related. Inserting (6.4) into (6.3) gives 


Afix) = eas ; IRIG - [lm Ie au). 


Assuming that C;= K; and therefore wj = K;"/K; and w; = I we get 


Aftx) = Že Ja ae L Ifj /K;) - (fix au) ; (6.7) 


For a small Ap, the kernel Ķ; can be approximated by a linear Taylor series 
expansion 


Kt ug Ba 
i i apy P k 


which allows for a further approximation if 4p, is small: 
z z dK; 2 OK; 
K‘ /K)) - 1 = (r+ Si. J-1=4 — 
jeo- = [t+ Sm) = ana 


i=0 


Introduced into (6.5) we get 
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dhe ae El a Sri {IIs au) (6.8) 


which is the same as (6.6). 

It is interesting to note that the estimator of the first derivative is the sum of 
derivative operators taken at each transition point in phase space independently 
of the actual size of the perturbation 4p,. This is quite different from correlated 
sampling where in (6.4) the perturbation dp, is an essential parameter of the 
estimator itself. 

The second-order derivative of f(x) in Expression (6.2) is given by 


(Sac) © 3 ea NE) 60m 





3 as 
= Dade: : [au 


or 


H = Sdn. paulo * Sook (ga) 


where the expression in the brackets {...} stands for its estimator. It consists of 
the square of the estimate for the first-order derivatives (of the transition kernel) 
plus the sum of the estimators of the second-order derivatives calculated for all 
transition points. 

Similarly the estimator for cross-terms can be derived: 


E5 = 2 J ttn es [au 1 


= OK; K; OK; n OK; n 
d lkr Ki dpadog K; K; Da K; ral - Leal ls). (6.10) 


i=0 K.: =0 i=0 


(fi) . * (6.9b) 








as well as higher-order derivatives of f(x). 


96 H. RIEF 


VII. PERTURBATION ANALYSIS APPLIED TO THE BOLTZMANN 
EQUATION 


In the following it will be shown how the perturbation and sensitivity algo- 
rithms outlined above can be applied to particle transport problems constituting a 
typical Markov chain process. All the methods described in this chapter have 
been tested in practice and have been added to existing particle transport codes, 
either by the author himself or others. 


7.1 The Neutral Particle Transport Equation 


Of the different forms of equations describing particle transport in phase 
space the integral representation (Fredholm integral equation of the second type) 
is chosen here for reasons of consistency with the previous derivations: 


p(x) = [ Kees: phply)dy + S*(x). (7.1) 


where: x and y are phase space vectors in R, 
w(x) denotes the particle collision density in x, 


K(x,y;p) is the transition kernel defining the probability of a transition 
from y to x which depends on a set of parameters p, being the subject of 
perturbation analysis, and 


S*(x) defines the first-flight collision probability for a source particle in x. 
(see expression 7.3c.) 


A classical solution to (7.1) is given by the Neumann series. It has the 
advantage that it can be directly interpreted in terms of a random walk problem 


w(x) = > y,(x) 


n=0 


y(x) = > fdu, iterii [duo K(x, thy; P)K (Uplly 1; p)... Klun; p) » (7-2) 
n=0 


or in shorter notation: 


y(x) = LI. oe in [ du; , 


where 
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Yo(x) = | pi K(x,u9;p) , 


is the direct source contribution, i.e. the collision density of particles starting in 
Up and reaching the collision point x without being scattered. 


W(x) = du, Jj dito K(x, p)K( Uj,Up;p) » 


stands for the contribution of particles which collide in x after being born in up 
and having undergone one scattering process in ty; 


W(x) stands for the twofold scattered particles and so on. 
It is common practice to factor K(x, y; p) into a product of a collision and 


a transport kernel, in which the six-dimensional phase space vector x (and y) is 
split into space and energy-angle components r and E. For i>0 we write 


K; = K(rig7, Biggs ri Ey p) = CCE jx), Ev ri P) Trig ru Eng p), (13a) 
while 
Ko = T(r}, Tor Ey, p) S(ro, E). (7.3b) 
and 


s* = [,28:) aro T(r) ,7oiP)S(ro,E}) $ (7.30) 


where the kernel C identifies the survival probability and the new energy and 
direction vector determined by a collision event, T the exponentially distributed 
intercollision distance, and S the particle source term as a function of space and 
energy. 


To perform perturbation analysis the previously developed algorithms are 
adapted to the particle transport equation. This topic has been dealt with in detail 
by the author and others (see References). Depending on the problem to be 
solved, either correlation techniques or differential operator sampling may be 
applied. 


7.2 Correlated Sampling Procedures 


Similar to the considerations above also in the case of the particle transport 
equation several possibilities of constructing correlation estimates aiming at a 
reduction of the variance of differential effects are investigated. In this context in 
section 7.2.1 typical likelihood ratios are discussed. In Section 7.2.2 an analytical 
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expression for the variance of the relative change in collision density is derived 
for an infinite, one group model. It gives the range for which a finite variance can 
be expected if the material density is changed. The following Sections 7.2.3 to 
7.2.5 deal with the construction of reference equations and perturbation models 
which extend the validity range (finite variance) of perturbation estimates. All the 
schemes described in Sections 7.2.3 to 7.2.5 are for example realized in the 
Monte Carlo program KENEUR (41, 42) and therefor this code will frequently be 
referenced (see also paragraph 7.4). 


As particle transport is described by a Fredholm integral equation the 
correlated collision density w*(x) can be calculated by a sampling procedure 
approximating the Neumann series 6.3. 


w'(x) = ae ft {TT TTR: du; , (7.4) 


where w; = K;*/K; represents a likelihood ratio. 


In the following considerations the capital letters C, T and F denote the 
kernels of the transport equation. In the Monte Carlo procedures described here 
they are probability density functions (pdfs) from which those discrete values are 
sampled which determine the random walk of a particle. All ø, denote macro- 
scopic reaction cross-sections; ø, is the total cross-section of the medium, og, the 
total cross-section of isotope k and ox; is a partial reaction cross-section of 
isotope k where the subscript s refers to scattering. The kernel C includes a 
kernel T(E’, Q’, E; r)y; which describes the direction and energy change for 
reaction type j (elastic, inelastic, etc.) of isotope k. 


7.2.1 Typical Likelihood Ratios in Correlated Sampling. To allow for a 
visible distinction between sample functions and likelihood ratios (weight 
factors) the symbolic expressions for the latter are put between curled brackets of 
the form {...}, e.g. {K*/K}. 


In Example 1 a relative density change Ap/p of an isotope mixture u is 
treated by multiplying all cross-sections in the kernel K; by (1+4p/p),,. 


Since K; = C;T; the two kernels C; and T; can be considered separately: 


te + AP). ae + Ap), 


a 
TE, Q, E- pQr = G, 05 
(P + Apo, (p + Ap),o%, (E, Bi- 12i- 17hy aa 


i.e. the density change does not affect the scattering kernel C;. On the other hand 


Tř = (1+Ap/p),o, exp[- (1+Ap/p).0,5;] « (7.6) 
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If the collision occurs in the perturbed medium along the flight vector it follows 
that 
w; = C T;"C,T;) = (1+Apip),, exp(-Ap o, s;) , (7.7) 
or if Ap/p 4 p and p = 1: 
w= l - Ap, (o,8;- 1). 
Example 2 deals with a partial cross-section change for a single isotope x in a 


mixture of K isotopes occupying M geometrical regions (k=1,2,.....K; m=1, 2,..., 
M): 


(1 Erau] Ok j 
c? =>] > — MIE, Q;, Ei- 1,Q)-157)p; 
i,m — Op + Oy Ore Jm One (E; is i — 199%; iT )j ’ (7.8) 
where 
amk = ( p ET , 


H = index of region in which the isotope k is the subject of a perturbation and 
Ôm = Kronecker's symbol defined as 0if m + y and / ifm = u. 

It is standard practice to choose single events from this pdf. A collision 
with isotope k leads to a weight factor 





= l 7 l+qy 
Ci) LD + Ox Oe) Sm’ (7.9) 
Similarly it follows that 
hg Ou 
x,t 
T z | J am| OM, Je- 2 anx Om,x,t s) ° (7.10) 








7.2.2 Uncertainty Estimates in the Analog Game. For an infinite medium 
problem with only one constant total cross-section o, = 1/A and a collision 
survival probability p the first and second moments of the scores of Ay can be 
calculated analytically. If Aw,, stands for the score of a history which dies after n 
collisions and Ap = Ao, /o, , we obtain 
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n i I 
(dyl) = [ra pa fafa +e- 47A) - n| 
=i j=l 


p" (1 -p) el-5 2) : (7.11) 
j=1 


Iterative integration gives Ay, = 0 for the first moment (k=1), as one would 
expect for physical reasons. The second moment which is a direct measure of the 
variance becomes however 


(1+ Ap)? -itt 
un? = | der | n]ot 1p) > 


and summation over all possible histories leads to 


y| 4L a, Up ne (7.12 
ia) ((1-24p)-(1+Ap)*p] (1-pY (1-24p)-(1+4p)*p ° oa 


where N, stands for the mean number of collisions per history. It is found from 
weighted summation over all collision probabilities, assuming that the probabil- 
ity of making i collisions is (J-p)p +7, thus 


= (-p) jp"! = I(l-p? . (7.12b) 


This example, which approximates a number of realistic cases, shows that there 
is only a limited range where correlated tracking has a bounded variance, i.e. 
conditions for which in in (7.12a) the term [...] >0. 


For sufficiently small perturbations ([dpl < 7) the method leads to a finite 
relative variance of the differential effects contrary to an estimate of the same 
quantity obtained by two independent Monte Carlo runs for which the relative 
variance would go to infinity: 


; l+ 
lim — PN, = (1+ N32 
ice awe il es i (7.13) 


Furthermore, correlated sampling has the advantage that the effort needed for the 


STOCHASTIC PERTURBATION ANALYSIS 101 


perturbed history adds only little to the computing requirements of the unper- 
turbed game. 


On the other hand it can be seen that the denominator may become zero or 
negative for parameter changes exceeding a certain size leading to an unbounded 
variance of the differential effect. In particular large reductions of a cross-section 
(or a medium density) or the complete removal of a medium from a reasonably 
large geometrical region may cause such effects (see 2, 3, 42). 


7.2.3 6-Scattering: a Biasing Scheme to Avoid Unbounded Variances. 
According to Rief (41) it is possible to obtain a bounded variance in quite a few 
of the above mentioned cases, if we add to the medium in question a fictitious 
scatterer in the forward direction. In such an altered game the collision density is 
increased without changing the reaction rates. If the amount of 6-scatterer is kept 
constant in both games a cross-section change of the real medium now applies to 
only one component of the mixture. This biasing technique allows even the 
voiding of large geometrical regions without risking unbounded variances. 


A fictitious 6-scatterer can be added to all geometrical regions to which a 
perturbation factor Ap is applied. The value of the 6-scatterer "05" is specified in 
units of the total cross-section g, i.e. os = qo;, where q is the input quantity. The 
biased cross-section of the perturbed region is then o,* = (1+q)o;,. 


7.2.4 Double Biased Correlated Tracking. It was shown that classical straight- 
forward correlated tracking may suffer from a very large or even unbounded 
variance if one deals with density changes exceeding certain limits. Voiding a 
region certainly falls into this category. A simple procedure to guarantee a boun- 
ded variance, even if the region is voided, consists in the introduction of the 
fictitious 6 -scatterer. And, indeed, a certain class of problems where voiding is 
of importance can be successfully solved by this technique. 


The two biasing schemes described in the following are based on a 
refinement of this idea and guarantee a bounded variance in all applications of 
practical interest. Since in both cases biasing parameters are introduced into the 
reference history as well as into the correlated one we call this method "double 
biasing". One of the schemes is made for voiding (or density changes) in a region 
filled with a material transparent to neutrons, while the other scheme applies 
mainly to strong absorbers such as safety rods, etc. 


Double Biasing in Transparent Media: In the presence of a 6-scatterer the 
intercollision distance of the reference history in the region to be voided is 
chosen from the pdf 


T; = (0,+05) expl- (0, + 05)5;], (7.14a) 
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and that for the correlated history from 
T;* = [o,(1+4p) + o5* ] exp{- [o,(1+Ap) + o5*)]s;} , (7.14b) 

where both parameters ø and 0" can be chosen freely provided they remain 
positive. We make the choice: 

05*=05-0,4p, 

q = 95 /0;, 

q=4q-Ap, 
With this assumption the biasing factor for the transport kernel of the correlated 
history becomes 


{T'i T} =1. 
After determining the site of collision the non-absorption probability p,, is deter- 


mined for the reference history by 


Pna = (Ps+4)/ (1+4), (7.15a) 
and for the correlated history by 
P'na = [ps(1+4p) + q- Ar]! (1 +4), (7.15b) 


Instead of choosing the survival chance for the reference history from the p.d.f 


Pna We multiply in the region subject to perturbation the weight of the reference 
history with p,,, and that of the correlated history with p”,,, respectively. 


C,=ppstq) and Cs=q/(p,+q). (7.16) 


The biasing factors of the correlated history become for "real" material scattering 


C; | (s + q)U +4p) 
=} = — , 7.17a 
Csi Pd + Ap) + q—- Ap ( ) 
and for ĝ-scattering 
4 _ __ s+ 9q- 4p) (7.176) 
Cali [p,(1+4p)+q-Ap]q © ' 


At the i-th collision point the weight of the reference and each of the 
correlated histories consists of the product of all likelihood ratios encountered up 
to that event. The choice of large os (or q) values reduce the variance of the 
perturbation effects but they increase the calculation time. 


Double Biasing in Non-Transparent Media: The "double biasing" scheme for 
non-transparent media is based on the idea of choosing in addition to a ĝ- 
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scatterer a fictitious total reaction cross-section øg which is used to sample the 
intercollision distance between collision points in the perturbed region from the 
pdf: 

T; = 09 exp( - 095;). (7.18) 
Usually og is assumed to be smaller or even much smaller than the actual o, of 
the region to be voided. 


With this assumption the reference history must also be weighted by a fac- 
tor of the form 





0, + 6, 
lit, = r : exp[(09 - 0, - 95) si] » (7.19a) 
i 


where T, Ty are the transport kernels for the real and the fictitious medium with 
the macroscopic cross-sections o, and og respectively, and the collision index i. 


The biasing factor of the correlated transport kernel T* (referring to the 
voided region) is of the form 


* 0,1 + Ap) + a, 
Fh = a exp|(00- 0: +4p)- 05) 5], (7.19) 
I 


where Ap is the relative density change in the perturbed medium. In these two 
expressions Gg and q can be chosen freely, provided they remain positive. 


This biasing scheme is particularly suited to deal with a situation where in 
the unperturbed case a region is filled with a strong absorber, while in the 
perturbed case the absorber is removed. Applying classical correlated tracking 
(with or without a 6-scatterer) would lead to the undesirable consequence that in 
most collisions the correlated history (which should actually survive!) would be 
terminated together with the reference history. Only in the rare event of its 
survival would the correlated history continue with a very high weight giving 
rise to strong statistical fluctuations. Therefore a scheme which satisfies the 
following two conditions is required: 

- the reference history should not be interrupted too frequently by absorp- 

tion in the region being subject to voiding or a density change; 

- the weight of the correlated history should remain unchanged as it crosses 

this region. 


To fulfil the first demand øp is chosen such that a certain - not too small - 
fraction of histories can cross the region without colliding. To satisfy the second 
condition we choose 


03=09-(1+Ap)o,, (7.20) 
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which leads for the correlated history in the case of 4p = -/ to a likelihood ratio 
( T*ITo ) = 1, having the obvious physical meaning that the particle crosses a 
voided region without a change of weight. 


After the determination of the site of collision the survival probability for 
the reference history is 


Pna= (Pst+Q(1+q) , (7.21a) 
and for the correlated history 
P na = [p{1+Ap)+q]![(1+4p)+q] . (7.21b) 


As in the previous case the weight of the reference history and the correlated 
history are multiplied by Ppa and P* na, respectively. 


In determining the type of scattering either real scattering or pseudo 
scattering in the forward direction can occur. Real scattering is sampled from the 
pdf 

C, = psl(Dst+4) » 
and 6-scattering from/ - C,, ie. 
Cs=4!(pst9) , 
where p, = scattering probability of the real medium. As was shown above, this 


leads again to multiplicative weight factors for the correlated game. They are for 
collision with the real medium 


{C*ICo}s i= (1 + Ap) (ps +q) /[(1 + Ap)ps +q] , (7.22a) 
= 0 if Ap=-1, 
and 
{C"lCo}s,8= (pst q)/[(1 + Ap)p, +q], (7.22b) 


for a collision with the 6-scatterer, 


If do = -1 and o,> ap, the weight factor for a fictitious straight-on collision 
may become (p,+q)/q > 1 in the few 6-scatterings that occur as is the case if 
Ps>q. Since large fluctuations of weight factors necessarily lead to a large 
variance it is advantageous to enhance the 6-scattering by multiplying q by a 
factor 7 which changes the pdf for collision-survival to (p,+nq)/(1+nq). The 
weight factors for the two types of scattering become with this assumption 


{C*lCq}5; = (1 + 4p) (p, + 9q)/ [1 + Ap)p, +q], (7.23a) 
and 


{C"\Ca}s 8 = (Ps + na)! {i + 4p)p, + q]n} . (7.23b) 


It is now possible to choose for any "reasonable" q (=o9 /o,) a factor y such that 
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the weight factor given by (7.23b) does not exceed 2, while (7.23a) remains 
always less than /. Including these changes in the unperturbed game means that 
new weight factors must be introduced: 


{C*ICa}s i= (Ps + 04)! (pp +), (7.24a) 
and 


{C"lCa}s 6 = (Ps + nqa)! (ps + Qn] . (7.24b) 


The criteria for choosing an optimum parameter combination of q and y 
seem to be complex and to a certain extent problem dependent. Nevertheless, as 
will be shown in more detail below, within a history the weight of a particle 
should decrease or increase monotonously as the particle propagates from 
collision to collision. Large fluctuations of the weight factor, especially if factors 
greater than one are involved, lead to large variances of the result and should 
therefore be avoided. 


In KENEUR the program selects between the two "double biasing" 
schemes on the basis of parameters specified in the input. If in a region subject to 
perturbation the total cross-section of energy group Eg is o, (E,) < ao, then 
"double biasing" in a transparent medium is assumed and 


05(E,)=09-0,(E,), (7.25a) 
If, however, o, (E, ) > G9, then "double biasing" in an absorbing (nearly black) 
medium is chosen and 

03 (E,) =o9- (1+Ap) 0,(E, ) (7.25b) 


which in cases of a total removal of the absorber, i.e. (J + dp) = 0, leads to 


05 = Op. Furthermore it is assumed that nq = J. 


7.2.5 Correlated Sampling in the Case of Material Replacement. In the 
perturbation schemes described above, we dealt with density changes which may 
be either small or even extend to the complete removal of a material. As has been 
indicated, a number of problems can be solved with these schemes. But in quite a 
few practical applications the interest is focussed on the perturbation effect 
associated with the exchange of one material (or cross-section set) for another 
one. It will be shown that it is also possible to attack this task by correlated 
tracking if it is applied to all phase-space transitions. A special correlated 
sampling scheme allows for the determination of differential effects (with regard 
to a standard solution), almost independently of the variance of the Monte Carlo 
results caused by the finite sample size. 


In the discussion which follows we distinguish for simplicity between only 
two input sets a, 8 and assume that a system consists of a configuration in which 
the (possibly very small) effect of exchanging the input parameters a by £ is 
studied. 


106 H. RIEF 


The simplest approach would be to associate a particle for which the phase- 
space transition parameters are sampled by a regular random walk procedure in 
system a with a correlated one which refers to 8. With this assumption the 
biasing factors would assume the general form (F, / Fg) in system B. In other 
words, the weight which is carried along by the perturbed particle would be a 
product of these factors. It is obvious that each of these factors must remain 
finite, a condition requiring that 

Fa (xj) >0 for all Fg{x;) >0 

In quite a few cases of practical interest this condition is not satisfied. It 
may occur, for instance, that a phase-space transition probability for a set of 
input parameters vanishes in system a while in system # they are larger than 
zero. To overcome this difficulty we construct a reference game denoted by 0 for 


which all pdfs. describing transition processes are a (weighted) mean of the pdfs. 
referring to the materials @ and B such that 


Rax >Fa (x; ) [Fo (x; )>0 and Rma > Fg (x; ) [Fo (x;) > 0 for all Fo (Xx;) > 0. 


Rmax is a maximum likelihood ratio which must not be exceeded. 


Now the reference (unperturbed) game is performed for this fictitious 
material characterized by - sometimes rather artificial - pdfs., while the responses 
for material œ and £ (and possibly others) are both calculated by correlation. 


An additional advantages of this method consists in the fact that by a 
proper choice of the reference parameters one can avoid large fluctuations of the 
likelihood ratios {Fy /Fo} and {Fg /Fo } and therefore limit the uncertainty of the 
differential effects. 


In the particle transport equation, the kernel K; is composed of the 
probability density functions which describe the different transitions in phase 
space such as transport from collision to collision point, type of collision energy 
and angle distributions, etc. and which are denoted by the capital letters T, C, T 
(or generally F). 


Sampling of Correlated Collision Events: In the following considerations it is 
assumed that a set of cross-sections a is replaced by another set b in one region 
of a multi-region problem. The total reaction cross-sections are denoted by a, p 
og, for system æ or ĝ and 


_ Catt oy) 


iy uE (1.26) 


for the reference case. 
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If the spatial collision distance s is sampled from Tp = og, exp{-a, s) the 
biasing factors become for system a; 


Ta) _ 
nA = z= * exp|(0o, - Oa) si], (7.27a) 
and in the same manner for system £: 


T; 
(2) = = exp[(o, - op) si), (1.270) 


These two biasing factors have the serious disadvantage that one of them 
has a positive argument in the exponent and can therefore assume very large 
values if the intercollision distance s becomes large. 


The introduction of pseudo-scattering in the forward direction: To avoid this 
shortcoming, a ĝ-scatterer in the forward direction is added to all three total 
reaction cross-sections. In each case its size is chosen such that the sum of the 
total reaction cross-section and the 6-scatterer adds up to the same value for all 
three cases. In the scheme described here and realized in KENEUR the following 
choice was made: 


Oat % 1) 
gj = Cato) at (7.28) 


The total reaction cross-section of the reference case is then augmented by 
a multiple q (21) of a3 i.e. by qaş in the reference system and by (q-I)og and 
(q+I)oz in system & and £, respectively. 

In KENEUR the parameter q can be specified. Its default value is 2. Instead 
of providing q in the input it is also possible to assign to a geometrical region a 
fictitious total cross-section g; (= Sop + qos ) which is the same for all energy 
groups. In this case the program determines q from the relation 


= [o7 + 00,(E,)] 
= at ie ‘ 


(7.29) 


for each energy group i. 


With these assumptions the transport kernel Tg determining the distance 
between collision points becomes 


To = (094 + 905) expl - (094+905)5;] . (7.30) 


Assuming that og < og, the two biasing factors referring to the systems @ and B 
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take the form 


(a _ Gast G+ Ios 


Toli 04 +405 explo, +405- O44 - (4 + 1)o4}s}} =f. (731a) 


T; og: +(q-1)0, 
BN we. Bt ó m = dlr 
z } — Og4+qo5 expf[0os + 495- ogs- (q- Dos]s} = 1. (71.310) 


After determining the site of collision the particle's survival chance in system 0 is 
sampled from 


_ Oos + 9% (Oas — Ogs) 


Pos = 01 +405 with O50 = 7 (7.32) 
This leads to the following biasing factors in the two systems a and £: 
[2a _ Oas +(q+1)05  Oo4+405 _ Gast (q+ 1)o5 (1.338) 
Pos Oot +q05 Oos +405 Tos +q l 
Zea) _ Ogs +(q-1)o5 
| Pos Cos + 405 (7.33b) 


The surviving particle is subject to either a real scattering or a 6-scattering in the 
forward direction. The probability of a real scattering is 


Co, s = %,s (Oos+4q05). 
Particles in the two parallel games are accompanied by the following 


biasing factors in which we include those derived for survival (not being required 
explicitly) 


| P Fas (0s +405) Oas t+(q+1)o5 _ Fas (7.34a) 
Co,s )\Po,s Laas + (q + 1)os] 00,5 90,5 + 406 0,5 

CG jz | o 

bs PBs) _ “Bs 

Cos Apos Das" (7.34b) 


In the case of 6-scattering which occurs with the probability gag /(o9, + qog) the 
weight factors take the form 
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fas [Pas = (qt los (09,5 + 905) ; Oas +(q+1)o5 = f+ Ll (7 35a) 
Co) \Po,s [oas+(qt+1)o3] 905 Oos +405 q’ i 
C 
pa lee sy Bip De (7.35b) 
Co,.6)\ Pos q 


Optimized multiple correlation: The scheme derived above works well if the 
number of collision processes in the zones of interest is small or else if in each of 
the systems (a and f) the biasing factors for real isotope scattering and for 
pseudo scattering (i.e. the expressions 7.34a and 7.35a as well as 7.34b and 
7.35b) differ only by little from each other. 


In cases where many collisions with quite different biasing factors occur, 
effects of "over-biasing" have been noticed in some practical applications. In 
such situations one observes after a seeming (and possibly smooth) convergence 
a sudden jump of the differential effect. In most problems this deficiency can be 
overcome by an optimal choice of q and a value Co „s Which replaces Cg. The 
introduction of this auxiliary pdf Cg, leads to likelihood ratios ba y and bg y as 
follows: 


7 se Pee _ Past (qt 1)05 O01 +405 Bus J, 
E Cis Po,s 902+ 995 Ogs +405 Oas + (q+ 1)05 Cos 
Ons 
a e ( 7.36a) 


Cos %.s +406 


for real isotope scattering, and 


ie ca Pa = _1 Utos (7.36b) 
0,6 


POs 1-Cos % s+ 9% 


for 6-scattering. Similarly in system £ the biasing factors take the form 





Cp.s eee 1 _ ps 
bg, = j=} = = MM, 7.37 
Br e POs Cos 90,5 + 406 ( a) 


C 2 | I (q- le, 

BO || PBs ô 

b. = jy = —— . F: 

pð 4 P0,s l- Cos 00,5 + 406 ii 
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With the introduction of Cg „s the following biasing factors are required in 
the reference system 





C, o, 
bor = [l = =” TE , (7.38a) 
Cos Cos Fs + 9% 
Chó 1 qos 
bonad PA a es 
- eal 1-Cos %, s+ 9% ` lai 


Now a generally applicable criterion for an optimal choice of Co „s and q 
must be found. It is obvious that the optimum parameter combination of Cy „s and 
q would be the one in which 


bay = bas and bg, = bps 3 


The two conditions lead to 


= a = Lasts 
Co,s 7 205 + Oas — Bs and q = Cas — Ops š (7.39) 


Since Co „s and q must both remain positive this optimum parameter choice is 
only possible if ogs > og, and o,,<0g, as required initially. In practical 
applications these conditions are sometimes satisfied if fuel or moderator 
material is replaced by an absorber. But even in cases where the requirements 
mentioned above are not satisfied it is possible to choose Co „s and q such that 


both terms 
i) (By 
bad bg.6 


To this end in addition to Co „s and q a third parameter c is introduced, such 


are close to one. 


that 


b b 
as = ¢ and Bs $ = 1 4 (7.40) 
ó bps c 


N 


Solving these two equations one obtains 


2 
CG, = Sa Ops and q= Caste Ops (7.41) 
a 2005 + (Oas - cogs) Oas cog, 
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In KENEUR the program searches for a c? value as close as possible to 
unity for the "reasonable" condition that gos <3o9 , (remember: 0,*> og, + qog). 


Again the reader is reminded that this optimization procedure is only of 
importance if long collision chains are occurring in a perturbed media. In fact, 
the longer the collision sequence is the more likely it becomes that the multipli- 
cative variance term (7.38) gets important. Changes in small regions which are 
the main areas of application can easily be analyzed with the original formulae 
(7.36) and (7.37). 


In any case, if it turns out that large weight factors cause problems it is 
possible to mitigate their effect by the use of "weight windows" which implies 
Russian roulette and splitting. KENO-4 provides such a possibility automatically 
and also Seifert (50) mentions this possibility. In large regions where the 
differential effect of two materials with quite different cross-sections is analysed 
it may, however, happen that for a low value of the upper bound of the weight 
window the splitting takes too much time or even worse that there occurs an 
overflow of the buffer which stores the split particles. 


Multiple correlation in multigroup energy transfer: In an energy dependent parti- 
cle transport problem using multigroup cross-sections a "mean" transfer matrix 
for two (or more) isotopes can be constructed which refers to the reference case 
of system 0. Assuming that a;, and b;, are the probabilities that a particle of 
system & or f jumps from energy group i into energy group k, one obtains the 
"mean" transfer probability 


Cik = (aig + big)l2 (7.42) 
where 2,4; ,= 1 and Za; = 1. 


For the biasing factors referring to the correlated games a and f there 
follows 


2b. 
{722} - Zas a #1 - Te (7.43) 
Toe aig + bik Io E aik + bik 


Multiple correlation in case of anisotropic scattering: Finally there remains the 
task of constructing a reference model and biasing factors for the case of 
anisotropic scattering. In some neutron transport codes, anisotropy is limited to a 
P} approximation in which u = cos? is chosen from the pdf: 


y=}0 + (2Ẹ-1X(1-/0/) , (7.44) 
where @ is the average cosine of the polar scattering angle. 


This distribution function is a step function in which, depending on the sign 
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of u, the p-values are equally distributed between jp and J or -1 and fg, where 
Mo = O-(1-/u/) = 2u-1 if @>0, (7.45a) 
Mg = O+(1-/u/) = 2u+l] if 0>0. (7.45b) 
The following correlated tracking scheme is a good example of its use in 
conjunction with discontinuous functions. 


The probability of choosing a u value in the interval (—1, Mg) or (ug, 1) is 

unity and the amplitude I of the distribution function 
P=1/{2u-1}. 

Contrary to the previous examples, where the reference game is played with a 

kernel consisting of an arithmetic mean referring to two or more materials, here 

the kernel for the reference game is chosen in a different manner. First we choose 

O=0 if 0,:63,<0 (7.46) 

if the two systems @ and £ have u-values of different sign or if one or both are 0 


(i.e. they scatter isotropically). If on the other hand 84. 8g > 0 we choose the 
reference value @p to be 


A = sign(O, ) min(/8,/,/O3/) . (7.47) 


Whilst with the help of 8ọ the scattering angle in the reference game is chosen, 


the biasing factors (1, /To}, {Tg/To}, etc. characterizing the correlated systems 
are determined by: 








Toa) 1-16 B wae 
{244 - ier ee ae diis 


if sign (0g ) - Up > 2 [04] — 1, else {Te /To) =0. 


7.2.6 Determination of Reaction Rates. Reaction rates are defined as the 
product of the neutron flux g(x) and the macroscopic reaction cross-section o,(x), 
or the product of the collision density y(x) and the reaction probability p, (x) = o, 
(x)/o,(x): 


= Wx) X 
p(x)o, (x) = o(x) 0,(x) = Y(x)p,(x) . 
From (7.2) it follows that 
PXPX) = PAX) > p(x) - (7.49) 


n=0 


For correlated sampling schemes this implies that in each collision point the 
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weight factors (products of likelihood ratios) belonging to the different systems 
under consideration are multiplied by the corresponding reaction probabilities. 
As has already been shown in equation (7.34) the weight factors are the products 
of the likelihood ratios for all phase-space transition processes (collision, energy- 
angle change, etc.) up to the point where the reaction rate is calculated. The 
calculation of correlated fission rates, as described in 8.2, constitutes a typical 
example of this kind. 


7.2.7. A 'Partial' Correlation and Splitting Method. Section 7.2 would be 
incomplete without mentioning a ‘partial’ correlation method which proved to be 
quite efficient if only relatively small geometrical regions are perturbed and 
surface splitting can be applied. The method is very simple, but the degree of 
correlation and hence the efficiency of the sampling scheme, varies from case to 
case. 


Let us assume that the effect of filling a small geometrical region with 
different materials is to be analyzed. In this case a particle starting from its 
source-point is tracked as usual until a perturbed region is entered for the first 
time. Up to this point, the scores are the same for all material compositions. 
Now, splitting may occur and the neutrons 'see' in a predefined sequence the 
different materials in the perturbed geometrical region from where they may 
return to the whole system under consideration. Depending on the material in the 
perturbed region the scores are attributed to the different cases of perturbation. 
Contrary to correlated tracking, there is no need to introduce a ‘reference’ system. 


Even if splitting into many branches is allowed the total running time does 
not increase substantially if the perturbed regions are sufficiently small and 
histories do not enter too frequently. Moreover, there are no problems with 
strongly fluctuating weights typical for correlated tracking in some situations. 
Furthermore, the method is much more efficient than the simple method of 
tracking histories in the different systems with the same chain of random 
numbers. The method is used in TIMOC-72 (18) and OMEGA/P,(50). 
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7.3 Derivative Operator Sampling 


7.3.1 Formulation of the Method. As shown in Section 6.3 a quite different 
way of estimating perturbation effects is the differential Monte Carlo method. It 
is based on a general or multivariate Taylor series expansion of (x): 


2 r 
Aside) = wx + 4p) = Wip) = SEP) dp + 3 HPL apy +.. (150) 


This method implies the estimation of parametric derivatives. The estimators for 
the various derivatives of the collision density w(x) can be derived directly from 
the Neumann series as shown in Section 6. Equation (7.2) differentiated once 
gives 


co n f) i n 
R T as 


where the square brackets stand for the regular Monte Carlo game and the curled 
for the estimator of the scores giving the first-order derivatives of w(x). 


The second order derivative of w(x) and the estimator for cross terms are 
the same as in Equations 6.6 and 6.7, i.e. 


- Š fet Jind ar. Sa- Eaj [fix]. asw 


i=0 





and 


et - par du, . alg dug pan ET -w ch 


© OK, SOK, Wa" 
rE rE WT x . (15%) 


In much the same way higher-order derivatives of w(x) could be derived. 





As in the previous section on correlated tracking total or partial cross- 
section changes are considered once more. In many practical applications it is 
preferable to deal with changes of the relative nuclear density p of a macroscopic 
cross-section Oy, Le. 


do = OP reat/Prea and p=1. (7.53) 
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From K; = C;T; (equation 7.3a) it follows that 


K OC, AT ani 
Kida Gdo Tp’ ; 


i.e. the collision and the transport kernels can be considered separately. 


73.2 Change of the Relative Density (total cross-section) of a Medium. As 
shown in equation (7.5) a density variation of the medium changes all partial 
cross-sections in the same manner. The estimator for the first derivative of K; 
referring to the remaining transport kernel T; = o, exp( -o; s; ) leads to 


oT: 
Lrg = E-o). (7.55) 


while the estimator for the second derivative is given by 


nan waT 
Dar a [e-o | =- n, (7.56) 





where n is the number of collisions. 


In a multiregion medium where a particle crosses the regions m and 
collides in M, the transport kernel can be expressed by 


T; = Py of! a- Pm of!) sn) ’ (7.57) 
m i 


which leads to the estimator 


OT; 
Lita, = malon = og'a» (1.58) 
i i 


where u = index of the perturbed region. Hence the estimator of the second 
derivative becomes for the n-th collision point: 


a°T, wo. 4) 
KB 2 > Lou, u/Pm -= Oy yul - > Ou); 
T; dp; i i , 


i 





(7.59) 
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7.3.3 Derivative of the Relative Density of Only One Isotope in a Mixture. 
Similar to equation (7.5), one expresses the collision kernel in the form 





ae OS Pk Iki ree Q, BOs), 
rae» Sawer (. yea yky Thij, (7.60) 


where p,o,{/o, stands for the probability of a particle reaction with isotope k. If 
in each collision point only one single collision event is chosen (as is standard 
practice), then the estimator for the first derivative of [7 C; becomes 


(t) 
Pye a = >| = a Sing (7.61) 
C; Fu = i Or f} 


In this equation the ratio 0,{/o, is sampled, provided the mixture in which 
the collision occurs contains isotope x. On the other hand 6, ,, indicated that this 
part of the sample is taken only if a collision occurs with isotope x. 


Similarly the estimator of the second derivative of the collision kernel 
becomes 





ð | © \_ o!\2 Oy 
Day aes) 7 SS) -i j (1.62) 


The same holds for the transport kernel T; in a multiregion-multi-isotope 
problem, where the derivatives with regard to the density of isotope x in region u 
lead to the estimators 


T; = [Ep of? Jeol 2 Pim ofsa) 3 


dT; of!) 
a ia >| of snb ng p (7.63) 


7 i Wry 


and 


se i 
3 wu \Ti Or u TA off li” (7.64) 
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In the case of evaluating the cumulative effect of two or more cross-section 
changes (e.g. different isotopes or different energy groups) the cross terms of 
expression (7.48b) must be calculated too. In this equation only the higher-order 
partial derivatives require some explanation. The other terms are determined as 
shown above. For the previous example it follows that 


2 


i 


l aC; APT, ] 
E = | = 
C; Pr uP u Tj Px IPA y 


2001 — off) of! of! 
| ~ -— — Oy + oof s?! Smu . (1.65) 


(ofp? oy off i 


The terms multiplied by 6, J» oF 6, respectively are only subtracted if collision 
occurs with isotope A or x. 


7.3.4 The Application to "Double Biasing". In the case of "Double Biasing" as 
described in Section 7.2.4 the cross-section o,(E) is replaced by a constant or an 
energy dependent cross-section og to which øș, the cross-section for a fictitious 
forward scatterer (without energy change), is added. In differential Monte Carlo 
as described above thedensity p of the reference history is subject to differential 
operator sampling, while og is kept constant. 


Differential operator sampling carried out simultaneously with correlated 
tracking provides valuable complementary information in the analysis of a 
perturbation problem. In particular it indicates the validity range of linear and 
quadratic perturbation methods. To determine the first two coefficient of the 
Taylor series the following derivatives referring to the different transition kernels 
have to be scored: 

a) The transport kernel from which the intercollision distances s; are 
sampled: 


T; = (pa, + 05) - exp[-(o, + 05)5;] . 
The scores of the derivatives to be taken at each collision point are 


es ee ee ee (7.66) 
Tido © Portos ' p+q 


and 
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ð ( ð ) -1 
Gj) = ——— ; 7.67 
PIB") = Tage (7.69) 
b) For "real" media scattering sampled from the pdf 


POs po,t+o5 


C; Pra = —— 
Sona 03 +05 portos ’ 


the scores of the first- and second-order derivatives are 





ECE sre i 7.68 
Cp Pra = patos) ~ Tra’ ii 
and 
2( a —(2p0, + 05 05 
-| —~—C = 3 7.69 
BAG, gp Pra) = po, + os) qen 
c) For ĝ-scattering sampled from the pdf 
= Os Port 
CoPna = po, +05 po, +05 — 
the scores of the derivatives are 
2 o _ h% q4 (7.70) 
Coop $ ra = poros = Tra 
and 
d{ a o? 1 
Jœ c )- — s. (7.71) 
rater oPna} (patos (+a 


With the above expressions the derivatives of the collision density w; in the i-th 
collision point can now be determined according to equation 7.47 and 7.48. The 
score of the derivative of a source particle starting in rg and undergoing first 
collision in ry is 


Kp j 
Kodo -F s 


The score in the second collision point becomes: 


d OK; 8 5 a 
Lk = na t Gayl’ elo pgi re] ng 
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and depending on the type of collision in (r;, E;) either C,S) or C,is used. The 
aa form is 


Freeh D3 (or C,Paa) + op CaP), 


where m and n refer to "real" media and "ô-scattering" respectively, with the 
condition that m+n = i-]. The same procedure holds for determining the second 
order derivatives 


a ar Large ‘+ Loer CsPra), +E la pha), 1D 


To calculate a second-order Taylor series expansion for the change of a 
reaction rate, as for example the fission rate př = pvop/(orto5), the following 
two quantities are required at each collision point: 


d + Ba > Opi prvidr 
op Pf nyp = (p+ (erat Yet pf EJ where p= Fog A (7.73) 
8? * = oY; * q dpi á * q 

ap? pY: = ae + 2p, (i+q) op 2p; (+r Yi. (7.74) 


With these quantities one calculates for example the elements of the 
perturbed fission matrix as shown in Chapter 8. 


In many practical applications it became evident that there is an excellent 
agreement between perturbation effects calculated by the double biasing scheme 
for transparent media and the ones calculated by a second-order Taylor series 
expansion, even in cases of voiding. 


7.3.5 Uncertainty Estimates. As in the case of correlated sampling the variance 
of an infinite medium problem can easily be calculated analytically. In analogy to 
Equation (7.11) we obtain for the score of 4y,, in a first-order (linear) Taylor- 
series approach: 


n i I 
(Ayh) = ja dì. San ($a + pjes -407 1) = | 
i=] jel 


xp"1(1 - p) e- ya) (7.75) 
j=l 
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Again <4y,> = 0 as one would expect. For /=2 we obtain by iterative integration 
<Ay,2> = Ap? p”!(I-p)Xi? which, after summation over all possible history 
lengths, leads to the variance 


13) = LEB = ason, or 


which agrees with (7.13). Again we note that even if the variance increases 
strongly as the collision survival probability approaches unity, it always remains 
bounded. This also holds for the second-order derivatives (41, 44) and most 
probably for the higher orders too. In (41) it could be shown that the variance of 
a second-order Taylor series approach is usually less than that of an equivalent 
correlated sampling case. This follows also from a comparison with expression 
(7.12a) for d4o—0. And whatever is valid for an infinite medium, holds a fortiori 
for practical applications in finite geometries - an advantage which is quite 
substantial when compared to correlated sampling. 


The infinite model problem with isotropic scattering considered above has 
been extended by Koreshi and Lewins (24, 25) to a one-dimensional finite slab 
problem with forward scattering. Here, the inclusion of perturbation factors in 
the integral formulation provide a clear understanding of MC perturbation theory. 
In this part of the investigation, it has been possible to demonstrate that the 
perturbation formulation may be interpreted in one of two ways - in the first, the 
required estimate is made over an event in the phase space during a history while 
in the second the required estimate is made over the entire history (see also Hall, 
8). In the former the estimate is weighted by the probability of a particular 
random walk occurring. The mean of the estimates was found to be the same in 
both cases since they are different ways of looking at the same problem. The 
perturbation terms, which must be used in the case of the collision estimator or 
the track-length estimator for estimation of boundary currents, were clearly seen 
from this development. These terms, using the collision estimator for evaluation 
of volume averaged scalar fluxes, have been incorporated into the MORSE code. 
The perturbation analysis was then applied to a simplified problem. The sampled 
derivatives were used in the Taylor series and compared with re-computed MC 
estimates from MORSE which were considered to be 'exact' results. From the 
discrepancies between predicted estimates and 'exact' estimates over a wide 
range of parameter variation, it has been possible to appreciate the quality of the 
results particularly with regard to the distance from the source. Again at larger 
distances from the source the derivatives are inaccurate as the collision density 
itself is not sampled well. 
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7.4 The Realization of Perturbation Schemes 


The different schemes outlined above have been incorporated in one form 
or the other in existing particle transport programs. Most of the programs are, 
however, still in an experimental stage or not yet released by the author. It is a 
common feature of all the programs mentioned here that energy-dependent 
sensitivities of any number of responses can be calculated for 3D geometries in a 
single Monte Carlo run at the expense of a very small time penalty. It is a further 
characteristic of these Monte Carlo codes that all the sensitivity information is 
available simultaneously, whereas conventional methods require a separate ad- 
joint calculation for each channel of experimental information. Also group- 
averaging errors can be avoided by the use of point nuclear data. A list of of 
these programs is given below: 


TIMOC (18,38) is a mul- 
tigroup Monte Carlo 
code using discrete ener- 
gies in elastic or inelastic 
scattering processes for 
which different models 
of anisotropy are pro- 
vided. The geometry ca- 
pabilities are the same as ANISN-SWANLAKE J (ANISN) = 381E-3 [nis] 
in MORSE. The program E ANCE ESES LINE 
can be applied in 'stand 
alone' mode or as a mod- 
ule of SCALE 4.2 using 
its ANISN format data 
libraries. It is equipped 
with correlated tracking 
and with first-order dif- 
ferential operator sam- 
pling algorithms. Depen- 
ding on the input specifi- 
cations the two schemes 
may be used either sep- 
arately or simultaneously. 


Addl Jt 





r (sphere) = 10 mfp 
olo, =0.9 


TIMOC calculates per- Figure 7.1. Change of fluence as a function of the rel- 
turbations and deriva- ative cross-section change Ao,/o, estimated by the 
tives with respect to sin- TIMOC correlation option for a monoenergetic 
gle isotope densities in sphere with a central point source. Results are com- 
mixtures and partial pared to a first-order ANISN-SWANLAKE calculation. 
cross-sections (elastic, 
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inelastic, etc.), for the energy dependent flux and current at region boundaries. 
Other estimators for which such quantities are determined, are the track-length 
estimator providing the mean value of the flux derivative in a geometrical region 
and three different point estimators: ‘last collision’, ‘once more collided’ (Kalos, 
21) and 'track-point' (Rief and Dubi, 43) estimator. 


In order to establish the necessary confidence in the new Monte Carlo 
methods dealing with differential operator sampling, a series of calculations were 
performed which could be compared with equivalent ANISN-SWANLAKE 
results based on exactly the same group cross-sections (39). 


The first example, shown in Fig. 7.1, deals with a monoenergetic, 10 mean- 
free-path radius sphere with a point source at the centre. It represents a typical 
deep penetration problem and is, therefore, significant for shielding applications. 
The graph shows the relative change of the total current J, as a function of the 
fractional change of the total cross-section 0, As ANISN-SWANLAKE allows 
for linear perturbation only, its validity range is rather limited, if compared to 
correlated tracking used in TIMOC. 


In the second test case a multi-group problem is analyzed in spherical 
geometry. It consists of an iron sphere with a central californium source. The net 
current and its sensitivity profile were calculated at the surface. In both methods 
the same 55 group cross-sections generated from ENDFB-4 in the energy range 
674 keV up to 14.9 MeV were used. So as to make easier comparison possible 
and reduce statistical uncertainties in the Monte Carlo analysis, results are pre- 
sented for four macro groups only. As usual, in Table 7.1 (below) we define 


ae dJ, Wy, 
Sensitivity = tae 
Energy Group ANISN - SWANLAKE TIMOC 
[MeV] fluence sensitivity fluence Sensitivity 
.0674 - 0.334 0.282 - 5.9 E-2 0.286 - 5.0 E-2 
0.334 - 1.353 0.533 - 4.7 E-2 0.533 -4.3 E-2 
1.353 - 4.724 0.094 - 2.0 E-2 0.095 - 3.3 E-2 
4.724 - 14.9 0.005 - 0.26E-2 0.0046 - 0.20E-2 
.0674 - 14.9 0.924 - 0.13 0.918 - 0.13 


Table 7.1. Iron sphere with central Cf-source, r= 20 cm 
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DUCKPOND (8) is a very versatile neutron and gamma transport code using 
point cross-sections. The program allows for the calculation of first-order deriva- 
tives in complex geometries. It has been used extensively for uncertainty analysis 
and nuclear data adjustment (10, 9, 32, 12) in conjunction with covariance 
information. 


MORSE Modified: A second-order derivative sampling scheme was implemented 
into the MORSE code (53) by Koreshi (24) for a study of the ETR TIBER-II 
blanket design, leading to a neutron sensitivity analysis with applications in 
design optimization. For this problem derivative sampling certainly has been 
found to provide a valuable methodology for sensitivity analysis and has been 
demonstrated to result in substantial computational efficiency. Finally, the results 
have been curve-fitted to a normal distribution using standard y? - and Kolmo- 
gorov-Smimov tests and have been shown to follow this distribution within the 
prescribed level of significance. For fusion problems, the MC perturbation 
algorithms, for second-order derivative sampling, appear to work quite well and 
have a potential for use as tools for sensitivity studies and design optimization. 


KENEUR and OMEGA/P which deal with multiplying systems are described in 
Section 8.4. 


VIII. MULTIPLYING SYSTEMS 


8.1. Eigenvalue calculations in critical systems. 


In linear operator sampling the eigenvalue problem requires additional 
considerations. In this case a homogeneous linear equation is solved describing 
multiplying media with a self-sustaining chain reaction (critical systems). The 
key quantity is the multiplication factor defined as the dominant eigenvalue kepp of 
the following homogeneous integral transport equation: 


kpdx) p(x) = fpd) K(x ypy vvd. (8.1a) 
or in a more compact form 


kalz) = [Pæyaly)dy , (8.10) 
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where: 
p(x) = vox (x)loy (x) defines the emission probability of fission neutrons 


q(x) = pr (x)p(x) the expected number of fission neutrons emitted 
isotropically in a collision process in dx at x ; 


P(x,y) = pr(x)K(x,y) the expected number of neutrons generated in dx at x 
due to one neutron starting in y. 


In Monte Carlo and in deterministic procedures multiplication eigenvalue 
problems are solved by the method of power (or source) iteration. Beginning 
with a best guess for gg (x), equation (8.2) is iterated until k, converges to a 
stable value (= keg): 


_ hiha _ " [kav + 1(¥)dx mp 
fka (x)dx [pav(xdx 


Ky +1 

Beside this iteration scheme it is also possible to calculate the multipli- 

cation eigenvalue by the fission matrix method. In this case equation (8.1b) is 

discretized by subdividing the fissionable space into a mesh of appropriately 

chosen volume elements for which the mutual fission probabilities are calcu- 

lated. The lowest eigenvalue of the resulting fission matrix is the multiplication 
factor Keg of the system. 


To calculate the fission matrix P itself, for starter particles in each of the 
volume elements i, the response - in this case the fission rate p; j (elements of P) 
in all the volume elements j - is determined. Suppose that the space over which pp 
(x)>0 holds is divided into N discrete volumes V; where i=/ to N. Then 
integrating over a volume V; gives 


k- qi = DPI 4% (8.3) 
J 
Here 
qi = fy axddx , (8.4) 
is the total number of fission neutrons produced in regions i and 
i 
Py = al dy), dxP(x,y)q(y) , (8.5) 


is the expected number of fission neutrons produced in regions j due to one 
fission neutron starting in region i. In terms of a Neumann series sampling 
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scheme we obtain 
Pij = fyo pdx > fdin be f,aisf, to K(Xx,U,,/0)K(UpUy_1'P). . K(uj,Uo;p) ’ 
n=] 


(8.6a) 
where the neutron source-term 
K(u;,Ug;p) = T(r}, ro; Ey, P) S(ro, E1) (8.6b) 


has a spatial source distribution S which is either flat in V;, or it is taken from a 
source iteration procedure, while E is governed by the fission spectrum. Writing 
q= column (qj) , 
P= matrix(pj;) 7 
allows (8.3) to be written as a matrix equation 
kq =P4q. (8.7) 
This is now in the form of a classical eigenvalue expression with q 
representing one of the eigenvectors and k the corresponding eigenvalues. It can 
be demonstrated that there is a unique real, positive eigenvalue, keff which is 
greater than any other eigenvalue. This keff corresponds to the effective 
multiplication factor as calculated in the usual Monte Carlo simulation. 
Furthermore, the eigenvector q associated with keff may be shown to be the only 
one which is everywhere non-negative and hence physically meaningful. The 
value of each element i of the eigenvector q is proportional to the number of 
particles generated in V; and hence the eigenvector represents the power 
distribution of the multiplying system after settling down to the fundamental 
mode distribution. 


To solve equation (8.7) the initial problem is that of calculating the matrix 
of py 's. Once the pij have been evaluated the eigenvalue k and eigenvector q 
may be evaluated by standard matrix methods. In the majority of these ap- 
proaches the simplifying approximation is made that the neutron source gq;, is 
constant over each region. Alternatively, the Pij may be calculated as part of a 
Monte Carlo simulation based on the source iteration concept. With this ap- 
proach the uniform source approximation is no longer required. Since the Monte 
Carlo method effectively proceeds by a number of fixed source calculations each 
beginning with a fission source, it is a simple matter to record py as the number 
of fission neutrons generated by a particle in a given region. This method is of 
particular significance for perturbation Monte Carlo problems and is discussed in 
the following. Knowing the uncertainties of the p; is possible to determine the 
uncertainty of the eigenvalue k, as shown in (30, pp 202). 
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8.1.2 The Use of Adjoint Functions. Some remarks on notations used in vector 
and matrix algebra are placed in front of this paragraph. In what follows lower 
case bold-faced letters represent column vectors. A row vector is written as the 
transpose of the former one by the addition of a superscript T, which also 
indicates the transpose of a matrix, changing it from A = [ay ] to Al =[. aj]. A 
scalar sum can be represented as the product of two vectors of the form 


i SE eae 
xy=ylx=) xy;. 
i 
Likewise a scalar form used later is 


xTATy = TAx=) > yiayx; - 
eo 


Furthermore we make use of the fact that the eigenvalue equation 
kw = PTy , (8.8) 
leads to the same eigenvalues k as Equation (8.7). 


These relationships can be used to improve estimates of 4k in perturbation 
analysis of critical systems. Assuming that there are two systems (a and £) with 
the same geometrical subdivision in fissionable regions, the following two 
eigenvalue equations can be formulated: 


kaq=Paq and = kgw= (Pg)Tw, (8.9a, b) 


where w is the eigenvector of the adjoint equation. If we multiply from left (8.9a) 
with w7 and (8.9b) with q7 and subtract (8.9a) from (8.9b), then we get 


q'Piw-w'P.q _ w'(Pp-P.)q 


Al = hy ~ hy = Pe r (8.10) 
_ Ld wilel? -pfa — 


È; Wiqi 


According to a theorem of perturbation analysis (Ussachoff, 56) these 
algorithms approximate Ak up to quantities of second order, while q and w are 
only exact up to quantities of first order. 

Matthes (31.) proposes the procedure mainly to get an improved conver- 
gence of k and w with relatively few neutron histories. In the case of perturbation 
estimates his q and w vectors apply to the reference case only, i.e. w is calculated 
like q from 
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kaw = (P,)'w. 


The method has been extended by Hoffman et al. (14,15) and Seifert (50- 
52) to the exact perturbation expressions again with all of the relevant quantities 
evaluated within a single simulation. Another similar approach is being 
developed by Hutton (16, 17) in the MAX code. However, here the method is to 
use an adjoint function derived from a deterministic solution to a geometrically 
very simple problem. 


In the following paragraph it will be shown how the matrices P, and Ps 
can be sampled, either by derivatives and a Taylor approximation or by a 
correlation scheme. 


8.2. Sensitivity and perturbation analysis in self-multiplying systems 


Attempts to use the classical iteration procedure in Monte Carlo sensitivity 
analysis have so far failed except for a limited class of problems where certain 
approximations are acceptable. Without going into details, it does not seem 
feasible to calculate eigenvalue perturbations from one generation to another. 
Experience shows that fluctuations in the propagated quantities soon grow to 
such a level as to obliterate all useful information. 


The alternative method described here has no precedence in deterministic 
procedures. It is based on the matrix or Green's function approach outlined 
above. It solves a typical inhomogeneous (initial value) problem by a Monte 
Carlo perturbation procedure which generates two, or even more, fission 
matrices P,, Pg, etc. They are either to be calculated by correlated tracking, 
multiple correlation or a multivariate Taylor expansion. In the following the dif- 
ferent score functions are listed: 


- Correlated tracking in its simplest form renders the following estimate for pj: 
Py = h py(xiphdx > | dun.. [du fyduo 
n=l 
x K(xX,U,/P)K (Upp /P). of K(u),Uo;p) è ( 8.12) 


where pp (x;p) = v&-/Z, is the fission probability at position x in volume element 
j, p a set of parameters subject to perturbation and K(u;,%é9,) the source term 
for neutrons starting in the i-th cell calculated by expression (8.6b). For 
correlated tracking a second matrix is determined with the component 


wi H. RIEF 
E ef 


X K(Xx,up:p)K(Uptn-1;p). -K(uj,Ugsp) , (8.13) 


and w; = K(u„1U; ; p+Ap) /K(uj,7,U; ; p). Note that the kernel K as well as the 
fission probability pr may both be subject of a parameter change (e.g. a density 
change of a fissionable isotope). 


- Multiple correlation (material replacement scheme): In this case the particle 
transport (random walk) is governed by the kernel K()(u),),u,) which refers to 
the reference system, usually consisting of an artificial material composition. 
One possibility of constructing the reference kernel K0) (factored into T() and 
C(9)) and the two kernels K(¢} and K(P) correlated to it, is described in 7.2.5. The 
collision density %0) of the reference system is given by 


yx) = 2, Í du,.. Í duf duo K(x u, K upttn.1). . KO (u;,tto) ‘ 
coo R R i 


There is, however, no point in calculating reaction rates with w since the 
reference system has in most cases no physical meaning. Instead we calculate 
directly the fission matrix elements for system æ and £: 


(2) = fyo de fydun x Jdu fyduo p) 


-X KH (x,u) KO (üp tty). KO (uz) » (8.14a) 


pP = h opp y [tn .- fidu fyduo {we 


x KP) xu, )K PY upin). Kuz Up) - (8.14b) 


In this multiple correlation scheme the likelihood ratios are as usual 
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and the fission probabilities in x become 


(a) (a) 
ny Y (x) oj (x) 


yy) of"x) 
P Gx) + qos 


of(x) + qo5 ° 


d pf) = 


where of (x) = macroscopic fission cross-section inx of systema, 


w@)(x) = number of neutrons emerging from a fission process in x of 
system &. 


For completeness also the fission probability of the reference system is given 


j 2 [op x) + qos] 


After following a sufficient number of histories the dominant eigenvalues 
Kors kp (and if wanted kg) of the fission matrices Pg, Ps (and Py) can be obtained 
by a numerical procedure. The method can easily be extended to a few more 
material compositions y,6,f, etc. for which the fission matrices Py, P5, Pg, etc. 
are computed simultaneously with P, and Pg. This can be done by a straight 
forward extension of the correlation method described above. In this case the 
procedure of the previous example is again applicable and (4k; } as well as 
(var(Ak; )) can be obtained. 


- The Partial Correlation and Splitting Method: This method is based on the 
ideas outlined in 7.2.7. We consider two systems, a or ĝ, (it is straight-forward to 
extend the method to more systems) both consisting of the same geometry, for 
which a perturbation limited to a relatively small region is specified. First a 
source iteration is realized in one of the two systems, a or $ (contrary to the 
previous case a reference system is not required). 


Beginning with the fission source of a given generation, the particles are 
tracked as usual until a perturbed region is entered for the first time. Up to this 
event, each collision point contributes in the same manner to both matrices, Py 
and Pg. After penetrating the surface of a perturbed region for the first time the 
history is split into two families of x branches each. From now on, the x 
branches of family @ are tracked independently in system a@ and contribute to Pa 
only. Correspondingly, the x branches of family £ are treated. If, for instance, the 
fission source iteration is carried out in system a@, then the histories of system B 
start from a distorted source. This error is corrected for, at least partially, by the 
fission matrix approach if a sufficiently fine subdivision in geometrical cells is 
provided. With increasing values of y the effort per history (the number of colli- 
sions to be computed) increases, the variance decreases and the time for a given 
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statistical uncertainty may become a minimum for a certain value of y. It be- 


comes evident that in the case of localized perturbations a high degree of positive 
correlation between the histories in both systems can be attained. 


- Second order Taylor expansion: Knowing first- and eventually higher-order- 
derivatives with respect to a certain parameter, it is possible to calculate the 
elements p; of the fission matrix of a perturbed system. Formally a Taylor-series 
approximation is obtained from: 


P$ = Py -| 2 py) 40 +5 1% al piao +... (8.15) 


In this expression pj is given by Equation (8.7) and the other terms are 


Spi = Syd px: pL vao eap veo) 


and 
3? 3 y(x; p); , Zop) , 
ap? lg f dx vt eS a ao a W(X; p)i 
Op; (x; p) aieh] 
e Er a |" 
with 


W(x; p) = yf de . . J dui], duo) | [Kíuniup), where u,,,;=X , 
n=0 1=0 l=0 


% WX; p) = 2 >| dun. me duif, duo Jno eae Kop WK where K; = K(uj,;,U;p) 


and 
a2 IK, V SK; 1 
Spe vision = Dati -Jdu fy duo as ) * aaa JLE 





Since by far the largest fraction of computer time is used to calculate the 
py-s and the derivatives, it is easy and inexpensive to determine keg as a function 
of various values of Mp. As a by-product in these calculations one also obtains 
the eigenvector which shows the differences in the power distribution between 
the unperturbed and the perturbed system. 
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8.2.1. The Calculation of Perturbed Responses in Critical Systems. Besides the 
difference of the eigenvalues, the perturbation effect of the other target quanti- 
ties, such as fluxes and reaction rates including fission (power) densities are of 
interest. These quantities can be calculated by the forward equation. To this end 
one scores for each volume element V; (referring to the fission matrices P, and 
Pg) the (normalized) response vectors zí% and zÊ) referring to the correlated 
games. According to the physical meaning of the eigenvectors g@ and g(®) each 
element gf) andgq/4) corresponds to the fraction of particles starting inV;. The 
response of a certain target quantity r’, e.g. the energy dependent flux, etc., can 
then be expressed as 


Ty = DON re? and rg = > fr, ie (8.16) 
i i 


It turns out that with very little additional computing effort both differential and 
correlated samples can be scored simultaneously. 


8.3. Approximation of the Perturbed Source Distribution in the Fission 
Matrix Approach 


The only approximation in perturbation estimates of critical systems is the 
subdivision in discrete volume elements for which the source distribution does 
not match the correct one. As already mentioned above, in KENO-4 and in 
KENEUR the pj-s of the reference case are calculated as part of a source 
iteration procedure. This distribution is an acceptable approximation if the 
number of volume elements is large and the flux gradients of the correlated 
games are similar to the reference case. It turned out, however, that in problems 
where strongly absorbing materials constitute the perturbation, the number of 
required volume elements may become too large to be handled by regular 
computers. 


To overcome this difficulty the source distribution of the correlated cases a 
and £ are approximated by a correlated estimate of the fission probability of the 
previous generation. It requires that together with the space coordinates of a 
source neutron of the next generation (reference case) also the ratios 


Of = pp (ry) [pp ry) and w,{P) = pePry,) ipl) , 


are stored. In the next generation all multi-correlated particles are started with 
weight one but the contribution to the matrix elements py (a) and Pij (B) are 
weighted by wp® and wP): 
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oo n 
pi? =o y" Pp (X;Po+Apa)dx Dit . Jaa) deo {TT} 
n= = 


X K(x,ü „i po)K(Unln-1;Po) - - . K(Uuy,UosPo) . (8.17) 


A similar expression holds for the 8 correlation. 


This procedure provides an approximation to the correct fission distribution 
in the different volume elements V; Obviously the approximation improves as 
the number of elements increases and their volume decreases. In a number of 
benchmark tests with absorbers replacing other materials this scheme proved to 
be very effective. 


8.4 The Realization of Perturbation Schemes and Examples 


- KENEUR (42) is a modified version of KENO-4 (37). It was devised for 
calculating perturbation effects caused by changes of the nuclear data input, in 
particular changes of the density or the composition of materials in one or more 
geometrical regions. The perturbation analysis is carried out together with the 
regular sampling procedure. In one option the perturbation effects are determined 
simultaneously by two different methods called correlated tracking and 
derivative operator sampling in conjunction with a first- and second-order Taylor 
series approximation. Another option based on multiple correlation makes it 
possible to replace one material by another. 


Besides changes in Kegs the perturbation analysis is carried out for all 
quantities sampled by KENO-4, i.e. the source vector (power distribution), the 
leakage, absorption, fissions, fission density and flux per energy group and geo- 
metrical region. For all sample values the uncertainty estimates are provided. 


- OMEGA/P (50, 51) is the perturbation version of OMEGA (50). It has two 
different perturbation options. Correlated tracking is applied if the perturbation 
affects a larger region of the core, while splitting is used for localized changes of 
the material composition. Both methods allow for a parameter choice to optimize 
the efficiency depending on the problem to be analyzed. They are combined with 
the fission matrix approach in order to account for eigenvalue problems. 


- A Perturbation example which allows for a comparison with several indepen- 
dent calculations is the Small LWR of the NEACRP 3-D Benchmark exercise 
(Takeda and Ikeda, 55). It considers the removal of a control rod in a three- 
dimensional reactor mock-up applying a two energy group model. A horizontal 
and a vertical cut of the system are shown in Figure 8.1 together with the sub- 
division in volume elements as required by the fission matrix approach (dotted 
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lines). KENEUR calculations were carried out for a mesh size of 5.0 x 5.0 cm? in 
the x-y (horizontal) plane and also 5.0 cm in the vertical direction. The grid is 
placed such that the control rod fills one grid square horizontally and the full 
height of core and reflector vertically. The OMEGA/P (51) results quoted here 
refer either to a grid size of 1.25 x 1.25 em? in the (horizontal) plane and 5.0 cm 
in the axial direction, or the same grid spacing as quoted above for KENEUR. 


Reflecting 
re 








Control Rod (case 1) 
Void (case 2) 


Vacuum 








Figure 8.1: Geometrical mock-up of the Small LWR as used by KENEUR in the 
3-D benchmark exercise. To exploit symmetry only 1/8 of the assembly need be 
modelled. 


The analysis was carried out by the material replacement version of 
KENEUR since the withdrawn control rod is replaced by a follower material. In 
the table below in addition to the multiple correlation estimate also results 
obtained by independent Monte Carlo runs are also listed. They were obtained by 
different authors using different codes and numbers of neutron histories. 
Therefore, the standard deviations quoted are not a direct measure for the 
efficiency of codes or methods. In the Table 8.1 system a refers to the empty 
control road position and system B to the control road fully inserted. 


Discussion ofresults: Case d) based on a multiple correlation calculus with 
source adjustment according to Section 8.1.2, agrees within the uncertainty limits 
with the three cases a), b) and c) which are obtained by independent Monte Carlo 
runs using a very large number of histories. The same holds for case f) which is 
the result of partial correlation & splitting and a fine grid subdivision. While d) 
uses cubic volume elements of 5.0x5.0x5.0 cm? in the core region, f) is 
subdivided into four-times more boxes of the size 1.25x1.25x5.0 cm>. Both 
methods render the same results which means that the source adjustment 
procedure of KENEUR is able to substitute a fine grid subdivision. In larger or 
geometrically more complicated systems this can be of considerable advantage, 
since a fine grid subdivision can easily lead to a fission matrix of a size non- 
manageable by standard computers or workstations. If partial correlation & 
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splitting uses a coarse grid size of 5.0x5.0x5,0 cm? which corresponds to case g) 
than one observes the same overestimating of Ak as in case e) where multiple 
correlation is used without a source adjustment procedure. 








syst.a syst. B Akey lke eg?) 


a) .9781 
+.0008 


b) .9776 
+.0007 


c) .9780 
+.0006 


d) .9770 
+.0008 


e) .9793 
+.0011 


f) 9775 
+.0018 


g) n.a. 


.9628 
+.0008 


.9624 
+.0007 


.9624 
+.0006 


.9614 


.9607 


.9624 


+.0019 


n.a. 


.0162 
+.0012 


.0162 
+.0010 


.0166 
+.0009 


.0166 
+.0007 


.0198 
+.0003 


.0161 
+.0007 


.0193 


two independent M.C. runs, 
each 1600 x 4000 histories, (Gonano,Rief). 


2 independent M.C. runs, (Nakagawa). 


2 independent M.C. runs; 
average of all submitted Benchmarks. 


multiple correlation (KENEUR), 


800 x 4000 histories ,3*3*3 fission-matrix 
elements with source adjustment. 


multiple correlation (KENEUR), 


3*3*3 fission-matrix elements, without 
perturbed source adjustment. 


Partial Correlation & Splitting 


(OMEGA/P), 12*12(hori.)*3(verti.) 
fission-matrix elements. 


Partial Correlation & Splitting (QMEGA/ 
P), 3*3*3 fission-matrix elements. 


DOn 
Table 8.1. keg of a Small LWR 
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The efficiency gain obtained by multiple correlation can be evaluated as 
the ratio of the variances weighted by the number of histories required, i.e. 


4x 122/72 = 12. It is 
due to the fact that 
in this particular 
case the computing 
time of the correlat- 
ed quantities in- 
creases only slightly 
as compared to a 
straight-forward 
Monte Carlo calcu- 
lation. 


In Fig. 8.2 we com- 
pare for the problem 
described above the F 
convergence of two 
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X 
: 909 tte a a et 
independent Monte L : j : 
Carlo runs with a 309 10 1108 1586 
single multiple cor- No. of Batches (Histories/Batch: 4808 ) 





relation estimate as 


number of histories. Figure 8.2: Comparison of the convergence of Akl{ki-kz) 
obtained by two independent (uncorrelated) Monte Carlo 
calculations and a single multi-correlation estimate. 


IX. CONCLUSIONS 


The capacity and limits of the different algorithms have been illustrated in 
a comparative analysis of 3-D benchmarks dealing with prototypical thermal and 
fast reactor models. This exercise was organized by the NEACRP (54). Apart 
from validating the algorithms a number of "field" applications have been carried 
out. In particular the detailed 3-D analysis of different loop experiments placed 
in a reactor assembly of very heterogeneous structure should be mentioned in 
this context. The uncertainties of the perturbation estimates depend of course on 
the problem treated. As a general observation the convergence is good if only a 
few regions are perturbed, or if the perturbations all tend to have similar effects. 
It is also necessary that the user is familiar with the theory behind the different 
perturbation options to choose the one suited best for the solution of his problem. 
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The correlation option requires the construction of a reference case to limit 
the variance of the perturbation effect. Unfortunately there is no general rule by 
which reference equations can be constructed. This task depends to a large extent 
of the problem to be treated and will therefore be one of the more important sub- 
jects to be studied in Monte Carlo perturbation analysis. d—scattering will pro- 
bably play a substantial role in this context. 


Nevertheless, adding to a classical Monte Carlo particle transport code the 
algorithms described above provides a wealth of additional information at the ex- 
pense of a relatively modest supplementary programming and computation 
effort. To the best of the author's knowledge the next release of MCNP (58), one 
of the most widely used Monte Carlo particle transport codes, will have an 
option which allows for the calculation of cross-section sensitivity profiles. 


Another interesting new use of differential operator sampling is studied by 
Sarkar (49) together with the author. It aims at the optimization of non-analog 
Monte Carlo games and provides the potential to set-up self-learning Monte 
Carlo schemes. 


Right from the beginning it was realized that the Monte Carlo method was 
one of the most promising candidates specifically suited for parallel computers. 
But in the case of multiple correlation and differential operator sampling the ex- 
ecution time can in addition substantially be shortened if the perturbation scores 
are calculated by a vector processor. 
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1. INTRODUCTION 


Nuclear power reactors produce useful energy in the form of heat and electricity. 
They also produce hazardous waste in the form of fission products which may 
remain dangerous to life for many thousands of years. The problems associated 
with the disposal of this waste is an international one and a variety of techniques 
have been proposed for solving it (1). The basic difficulty is that of devising a 
containment system that will isolate the nuclear waste from the biosphere for a 
period of time that is long compared with the associated half-lives. This a moral 
as well as a technical problem since we have a responsibility to avoid exposure to 
future generations or indeed future civilisations. Whilst the moral aspects will 
not be discussed in this paper, they have to be borne in mind since they affect 
political thinking and hence the public funding of the waste disposal problem. 


In the past ten to fifteen years, the two most serious proposals for disposal have 
been burial at sea and burial in deep rock formations. Other proposals, such as 
disposal via rockets to the sun or further irradiation, have cost or technical 
difficulties that are considered insuperable at the present time. Burial at sea has 
met with strong environmental opposition and the idea has essentially been 
abandoned. While deep rock disposal is not entirely free of antagonists, it seems 
to be the only feasible solution and most of the current effort is directed towards 
understanding the technical problems and costs which will arise. In this paper, 
we will be concerned with disposal on land in stable rock formations. 


The main criterion in the selection of a site for a repository is that the host rock 
must have a history of stability, i.e. the rock formation shall have been quiescent 
in terms of earthquake or violent fracturing for hundreds of thousands of years. 
While the repository itself may occupy only some 3 or 4 km?, the rock formation 
may well have an areal extent of 300 000 km?. The area should also be free of 
any ore-bearing rocks likely to be of interest to man, since future generations 
may mine these areas when all knowledge of the repository has been lost. 


In addition to the above requirements, there is another one of considerable 
importance. Namely, that the influence of groundwater be minimised. Rock 
formations, by their very nature, contain fissures and fractures due to the effect 
of forces in the earth's crust. Inevitably, over a period of time, water will find 
its way into these void spaces and lead to a slow but finite advective motion 
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around the repository (2). Now, while the canisters in which the waste is stored 
may be very strong and backfilled with material of low permeability such as 
Bentonite, over several thousand years the canister contents will be leached out 
into the local groundwater. It is therefore important to be able to predict the 
motion of the groundwater and its likely variation with time over perhaps one 
million years. Such a time scale introduces a further variable into the system, 
namely climatic changes. On average, there is a glaciation every 100,000 years 
which affects climate and, due to motion of the ice sheets, can lead to erosion of 
the repository site and may even uncover it (3, 4, 6). 


It will be noted from the above discussion that the factors to be included in an 
assessment of a repository are many and varied. Geology of the site, human 
behaviour and climatic variation are simply headings to a host of sub-problems. 
Clearly, in an article of this nature it is not possible, nor desirable, to cover all 
these matters. Instead, we will concentrate on one aspect which has relevance to 
transport theory in a generalised sense, namely the modelling techniques that are 
employed to study the movement of radionuclides through fractured rock. Such 
a material may be regarded as a spatially random medium and so the problem 
becomes one of modelling groundwater and the dissolved radionuclides in a 
stochastic system. We will describe some of the methods currently employed to 
understand the problem and introduce some new ideas that have proved useful. 
Where possible, the methods used will be related to those of conventional reactor 
physics, thereby assisting the newcomer to the field to appreciate the inter- 
disciplinary nature of this field of activity. 


The plan of this article is, (a) to review briefly the past work on the subject of 
material transport in spatially random media, (b) to outline the general 
mathematical and physical principles behind the current methods of calculating 
the rate at which radionuclides move in fractured rock and (c) to describe a new 
method of dealing with (b) which is based on an analogy with neutron transport 
and which overcomes several of the shortcomings of the classical approach. 


2. REVIEW OF PAST WORK 


The transport of radionuclides through porous media is conventionally treated by 
means of the advective-dispersion equation as described by Bear (5). However, it 
is well-known (7) that dispersion depends sensitively on the spatially stochastic 
structure of the medium and in most cases has been shown to be scale dependent. 
That is, the dispersion length for a given material has a tendency to depend on the 
distance over which measurements are made or the time of travel of the solute in 
the body. There is, therefore, some reason to question the classical approach to 
this problem. In this review of the problem we will discuss the approaches 
adopted so far and attempt to elucidate their advantages and disadvantages. 
Before doing this, however, let us note a striking example of the scale problem 
mentioned above by referring to Figure 1 which shows the results of extensive 
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measurements of longitudinal dispersivity for many field sites around the world. 
These results are taken from Gelhar (39). It is clearly evident that the 
dispersivity (i.e. dispersion length), which is measured by using the classical 
dispersion equation, is a function of distance. A field test in which such 
behaviour occurs, indicates that the underlying physical model of the processes 
involved is unsatisfactory. It is clearly untenable that an ostensibly fundamental 
parameter of the system actually depends on the ‘size’ of the system. 


The movement of a solute through a porous medium is very complicated if a 
detailed microscopic view is required. Porous media are inter-connected sets of 
voids and solids, regularly or irregularly distributed. It is virtually impossible to 
describe deterministically the geometrical form of most porous media met in 
practice. For this reason a model has to be developed that will describe, to a pre- 
defined or generally accepted degree of precision the properties of the medium. 
Such a model is a differential or integral equation which is derived from 
theoretical and/or experimental observations. In the case of a porous medium, 
there are really two separate problems. Firstly, it is necessary to describe the 
porous medium itself in terms of its voidage and secondly the characteristics of 
the processes taking place in the porous medium have to be modelled, be they 
fluid flow or transport of heat. In the case of the movement of radioactive 
solutes, the porous medium has been characterised by porosity and dispersivity 
and the solute by its concentration. However, while concentration of solute will 
always be of importance as a measure of the radiation present, the description of 
the medium by porosity and dispersivitiy is certainly not sacrosanct. 


REUABILITY 


Longitudinal Olspersivity (m) 


+ low 


O Intermediate 


O high 





Figure 1; A compilation of longitudinal dispersivity values as a function of the 
scale distance travelled by the solute, inferred from various field tests 
(reproduced with permission of L.W. Gelhar and the American Geophysical 
Union) 
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Generally, the work carried out on this problem to date consists of two distinct 
approaches. The earliest approach accepted the concept of Fickian dispersion, 
i.e. an equation for the concentration C of the form (in tensor notation) 


oC oC aC 
&+y =D, 
or ED ox, "xx; 





and had as its goal the calculation of the dispersion tensor components, Dj; ,in 
terms of the geometrical properties of the porous medium. 


Later work recognised the possibility that the dispersion equation itself may not 
be a true representation of the way in which solute moves. Thus a more 
fundamental approach was adopted in which the geometrical properties of the 
medium were described statistically and the movement of solute also described 
statistically by means of stochastic differential equations. As will be seen, such an 
approach leads to interesting conclusions about solute motion not predicted by 
classical theory. 


There still remains, however, certain anomalies concerning those methods that 
employ stochastic analysis to calculate the mean square distance of travel of a 
marked particle and those that start from the stochastic partial differential 
equation for the concentration. 


Probably the most far-sighted of these early methods was that proposed by 
Taylor in 1921 (8) in which he followed the fate of a marked particle undergoing 
discrete random jumps. The introduction of the velocity correlation function and 
its relationship to the mean square distance of travel was a tour de force that has 
guided workers on statistical problems ever since. In 1952, Taylor (9) introduced 
an entirely different approach in which he envisaged a well-defined network of 
capillaries and attempted to solve the advection-diffusion equation in that network 
to obtain an effective dispersion coefficient for the overall system. This idea has 
also been pursued by others, in particular Aris (10) and Brenner (11). In a 
related manner, Saffman (12, 13) has also set up a capillary network but with 
randomly varying orientations and has calculated the mean square distance of 
travel through such a network. 


One of the shortcomings of the mean square distance of travel technique is its 
rather limited ability to make any comment about the overall spatial distribution 
of solute or the equation obeyed by it. The best that can be done is to note that 
the most general function completely defined by a mean value and a variance is 
the Gaussian distribution. This in turn satisfies an equation of the advective- 
dispersive type with the values of water velocity and dispersivity calculated from 
the mean and variance. This is the approach of Dagan (7) and it certainly meets 
with considerable success. However, the mean square distance of travel is 
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actually an infinite medium method. One may ask whether it is acceptable to 
deduce a balance equation from infinite medium considerations and then apply it 
without further ado to finite medium problems. This matter has yet to be 
addressed. 


A different, and yet in many ways related, method is that of the stochastic 
differential equation as employed by Gelhar and his associates (14, 15). In this 
method, which is well-developed in other fields, the classical balance equation for 
the solute is written down and then various parameters within it, such as flow 
velocity and dispersivity, are identified as being stochastic. If the equation can be 
solved, then the stochastic concentration may be expressed in terms of the 
stochastic properties of the input, and any desired average calculated. Of course, 
it is rarely possible to solve such stochastic equations and so perturbation methods 
are employed which lead to solutions under certain restrictive conditions. For 
quantities like mean square distance of travel, the results of the perturbation 
analysis usually agree with those which employ the mean square distance of travel 
technique directly. However, the stochastic differential equation can make 
predictions about the shape of the mean concentration and its higher moments 
which may sometimes be non-Gaussian, thus it does seem to be more flexible. 
This is only apparent, however, because the mean square method can be extended 
to calculate higher spatial moments and these will constitute a measure of the 
deviation of the associated concentration distribution from normality. It is then 
possible to construct a new, non-Gaussian function from these moments which 
must be strictly equivalent to the non-Gaussian results arising from the stochastic 
differential equation method. Indeed, the seeds of the method are contained in 
the work of Gelhar et al. (14) when they expand the solution of their non- 
Gaussian equation in terms of a Gauss-Hermite series. 


Having made these comments about the apparent superiority of the stochastic 
differential equation method, one ought to inject a word of caution. For 
example, what right do we have to assume that the classical balance equation in 
stochastic form actually describes the physical situation. Although it is derived 
from the continuity equation, it nevertheless contains the assumption of Fick’s 
law, J,=-DVC. Is this valid, even in a stochastic sense? This matter does not 
seem to have been raised in the literature but would seem to merit some 
discussion. We very briefly outline our concerns here. Firstly, we note that 
even if we use Fick’s law as it stands in the stochastic sense, it contains essentially 
two unknowns, D and C. It is our aim, for example, to obtain an equation for 
<C>, but also we need information about D. Such information depends on the 
functional dependence of D on the physical properties of the porous medium and 
of the fluid flowing through it. Thus to set D = aU and then to relate a and U to 
the properties of the medium via hydraulic conductivity and porosity is one 
possibility. However, there are several other equally acceptable expressions for 
D, for example those due to Saffman (12, 13) or Brenner (11). Presumably this 
matter could be settled by fixing a flow regime but since U is a stochastic 
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variable covering perhaps very small up to quite large values, this may not be too 
easy. We leave that point for the future. Our final comment concerning the 
validity of a stochastic Fick’s law stems from the relation 


J,=-DVC + f, 


where f, is a Langevin noise source with mean value zero. The equation for <J j> 
is therefore still of classical type but the stochastic form permits an unknown 
fluctuation. A detailed review of the techniques developed to solve stochastic 
differential equations associated with transport in fractured rock may be found in 
Williams (/6). 


3. MODELLING METHODS 


To assess the rate at which radionuclides move through the host rock it is 
necessary to carry out three phases of work: 


(1) Model the groundwater flow in and around the repository. 

(2) Evaluate the release rate from the repository, i.e. calculate the source 
term. 

(3) Model the radionuclides dissolved in the water and adsorbed on the rock. 


A detailed treatment of item (3) is the main purpose of the paper, but a brief 
discussion of items (1) and (2) will be helpful to set the scene. 


3.1 Groundwater Flow 





If we regard rock as a porous medium, then it is part solid, part void. In a 
saturated medium, such as we consider here, the void space is filled with water 
and it is the motion of this water that needs to be predicted. 


The law of fluid flow in porous media is that due to Darcy. This law relates the 
flux of fluid to the driving forces (2). If q is the specific discharge (Darcy 
velocity) then it is related to the pressure P in the following way 


q- EVP - pp) (1) 


where k is called the permeability tensor, w is the fluid viscosity, p the fluid 
density and g the acceleration due to gravity. We note that q is not the fluid 
velocity in the pores; this is given by u=q/e, where £ is the porosity, q is in fact 
the flux of fluid per unit area of the rock (i.e. voids and solid). The permeability 
is a measure of the ease with which water passes through the rock. 
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If the fluid has constant density it is possible to write 
P =P, + pyg(z— 2p) 


Zo being some reference level in the vertical direction. Then we may write 


k 
q--—VP, 
H 


or in the terms of the head h= P, / pyg 
q=-K.Vh Q) 


where K = ppgk/p is the hydraulic conductivity tensor. In a steady state flow, we 
have from continuity 


V.q=0 (3) 


V.K.VA=0 (4) 


This equation, with its boundary conditions, may be used to predict groundwater 
movement in saturated media. For unsaturated media, the situation is more 
complex (5). In general, the region of interest will extend over several rock 
types so that eqn. (4) will have to be solved numerically. Several computer codes 
exist to do this (4). It should be borne in mind that classical groundwater 
calculations assume that K is a well-defined deterministic quantity, whereas the 
present discussion regards K as a stationary random function of space. Eqn. (4) 
is therefore a stochastic differential equation. Clearly, therefore, fluctuations in 
K lead to fluctuations in h and hence in the Darcy velocity q. More will be said 
about this below. 


3.2 The Source Term 


Bearing in mind that the repository is initially well-contained, we may assume 
that during the early stages of burial t < T, the radionuclides decay without any 


transport. Thus for a given initial inventory S, Bq m~3, if the container breaks 
down at ¢ = T, the source strength available for dispersion is $,; exp (-A,;T) for 


each nuclide i. However, the maximum leach rate sets an upper bound on the 
rate of radionuclide release. Thus if L; is the maximum flux of radionuclide i 


released by leaching, the current into the geosphere from the repository will be 
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J, = min[vS,,,L,Je* with v the velocity. (5) 


Other, more complex release rates may prevail if the pH or other 
physicochemical properties change with time. 


3.3 Classical Radionuclide Transport 





The standard procedure for assessing the concentration of radionuclides in a 
porous medium is via the classical advection-dispersion equation, (2), viz 


£ =V.D.VC - V.(UC) -4C (6) 


where C = C(r, t). For simplicity, we have assumed only a single radioactive 
decay chain. D is the dispersivity tensor and U the groundwater flow velocity 
(not the Darcy velocity). This equation describes transport due to the overall 
average advective motion and also that arising from the tortuous paths in the 
interstices of the host material. Such motion, which is mechanico-geometric in 
origin, is regarded as giving rise to a Fickian behaviour for the current. This is 
an assumption and more will be said about it below. 


adsorption 
desorption 





Figure 2 : Adsorption-desorption process between rock and solute 
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Eqn. (6) is fundamental to most radionuclide transport problems and so it is 
worthwhile spending some time on it because, as written above, it is incomplete. 
As water-borne radionuclide is transported through rock it comes into contact 
with the rock by virtue of the interface between water and rock in a void or 
fracture. Figure 2 illustrates the point. A process known as adsorption occurs 
wherein the dissolved contaminant becomes attached, via physico-chemical 
forces, to the rock. There is also a reverse reaction in which desorption occurs, 
returning the contaminant back into the water. This phenomenon plays a crucial 
role in containment of radioactive nuclides as we shall demonstrate. 


Let C be the concentration of the solute (mass per unit volume of water) in the 
water and let F be the concentration (mass per unit volume of rock) in the rock. 
Then we can write down two balance equations involving C and F. 


In the water, 


e SE = eV.D.VC-eV.(UC)-2eC- f(F,C) (7) 


and in the rock 
OF 
(1-€)>-= -(1- e)F + f(F,C) (8) 


In eqn. (7), the removal terms arise from dispersion, advection, radioactive 
decay and finally adsorption. In eqn. (8), removal is by radioactive decay and 
replenishment is via adsorption. The adsorption term f(F,C) denotes the rate at 
which rock takes up radionuclide. f is the so-called adsorption isotherm and 
various analytical forms exist for this. The simplest is the linear isotherm, such 
that (5) 


f=10@6C-F) (9) 


where T is a time constant governing the rate of the adsorption-desorption 
process. Eqns. (7) and (8) are the governing equations and are very similar to 
the equations of reactor kinetics with one group of delayed neutrons. 


The two equations may be simplified by adding them, when we find 
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Now, in practice, the time constant T is very short compared with times over 
which C or F change appreciably (i.e. hours or days compared with hundreds of 
years). In that case, we may assume that T=0 or, equivalently. F = BC. 
Physically, this implies that the concentrations of radionuclides in rock and water 
are in equilibrium. Eqn. (10) now becomes 


Sex -€)B)C = eV.[DVC - UC] - A[e+ (1-e)B]C (11) 


If we define the retardation factor R as 


l-e 
R=1+— 
= 


where B= K,p, (Kg being the distribution coefficient and pẹ the rock density) 
then equation (11) becomes 


2rc= V.[DVC -UC]~ ARC (12) 


If R is independent of position and time then eqn. (12) becomes 


9 euylPyc_Ucl_ 
2c=v {Pvc Zc] ac (13) 


Now this equation is identical to eqn. (6) provided we use the effective 
parameters D/R and U/R. In other words, the effective dispersion coefficient 
and the effective advective velocity are reduced by a factor of R compared with 
the actual values. Values of R vary widely for different radionuclides. For 
example, Table I (6) shows values of R for transport in granite for a variety of 
elements. We note that, for I, R = 1 whereas for Pu it varies between 20 and 
2000. The large variation arises from experimental uncertainty due to lack of 
knowledge about pH and Eh and precise details of the rock material. The 
practical consequences of the introduction of R is that, given a mixed source of 
radionuclides at time zero in the repository, at some time later the isotopes will 
have separated spatially due to their different effective transport parameters. 
Iodine always travels fastest. This phenomenon is the basis of chemical 
chromatography and the background theory is well known (17). For a 
repository, the effective reduction is of great benefit since the radioisotopes are 
contained within the geosphere longer thereby allowing radioactive decay to have 
a greater depletion effect before emergence to the biosphere occurs. Eqn. (13), 
and its modifications due to multiple decay chains, is the working equation for 
many proprietary computer codes for assessing contamination. 


RADIONUCLIDE TRANSPORT IN FRACTURED ROCK 151 


[Element |R_ value | 
Sr 





Table 1: Retardation coefficients in granite 


Before further detailed discussion, it is important to note that theoretical and 
experimental studies of the dispersion tensor D, have shown that its components 
Dij take the form (2, 5) 


Dy =m ôU +(a, ~ ap) (14) 


where U = (U,? + U,? + U7)! and U; are the orthogonal components of U. ay 
and a, are the transverse and longitudinal dispersion lengths (or dispersivities), 


respectively. They are functions of the microscale geometry of the porous 
medium but not usually of water velocity. 


In principle, D should contain the molecular diffusion coefficient but in the cases 
we shall consider this is generally negligible compared to dispersion. We do 
note, however, that important practical situations can arise, for example in dead- 
end pores, where molecular diffusion cannot be neglected. 


We have described the porous medium in terms of a dispersion coefficient. Such 
a description has its limitations as we have noted in our earlier comments on 
scale-dependence; however, in some rock formations, the concept of dispersion is 
no longer meaningful. This arises when the rock is traversed by fractures. 
These large scale faults can be regarded as ‘short-circuits’ which allow 
contaminant at one position in the rock to be transported very rapidly over a 
large distance virtually by pure advection. Dispersion is a much slower process 
because it involves a tortuous path. To illustrate the difference between normal 
porosity and a fracture, consider Figure 3 which is a two-dimensional view of 
fractures in rock showing the typical range of sizes of grains and fissures. It is 
clear that over scales of centimetres to metres, the path through most fissures is 
quite tortuous and is described by dispersion. However, the wide fractures allow 
transport over a much larger distance with little resistance. Thus there are 
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essentially two scales in the rock: that due to the small scale fissures and that due 
to the large fractures. A form of ‘double porosity’ exists. A classical advection- 
dispersion equation cannot adequately treat this situation without some 
modification. Methods have been devised to deal with the situation (78) in which 
the classical equation is used in the small scale region and is related to another 
advective-dispersion equation in the fracture which has different dispersive 
properties. Such an approach is quite popular but does not address the observed 
failure of the classical equation in the sense of scale-dependence. 


IRI 
IKIE 


Figure 3 


Two-dimensional view of microfissures in rock showing typical 
scales of imperfections 


The discussion thus far suggests that there are serious limitations associated with 
the classical advective-dispersion equation. This is indeed so, although many 
proprietary computer codes continue to use it, presumably because of its 
simplicity and the absence of anything better. The extension of the classical 
equation, with the stochastic modifications noted earlier, could be regarded as 
leading to a more appropriate formulation. Unfortunately, these stochastic 
models lead to sets of highly intractable equations, the accuracy of which is 
uncertain and which are complex to deal with numerically. Also it is not always 
clear how to obtain the necessary stochastic properties of the rock that are 
required input data. For these reasons, there seems to have been no serious 
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attempts to incorporate analytic stochastic methods into working computer codes. 
Nevertheless, a stochastic approach has been adopted and has received 
considerable attention. It is based on Monte Carlo simulations of radionuclide 
and groundwater transport. Essentially, random numbers are used to generate a 
number of overlapping lines (fractures) according to a probability law specified 
by a mean fracture length and orientation distribution. Then, marked particles 
are tracked through these fractures according to an appropriate equation of 
motion determined by fluid flow and other processes such as adsorption (7). A 
large number of histories of particle behaviour is accumulated and appropriate 
averages are taken (19). This method is computationally very time consuming 
and depends upon a knowledge of the underlying probability laws of fracture 
distribution and flow characteristics in the fracture. Unless, these are known 
accurately, it is debatable whether the effort is justified. Nevertheless, the 
approach is potentially accurate and avoids many of the limitations of the classical 
advection-dispersion equation. 


It is our purpose to describe now a procedure which retains the physical appeal 
of the classical advection-dispersion equation, but includes the sophistication of 
the Monte Carlo technique, without the limitations of either method. The 
remainder of the paper will be dedicated to this purpose and some illustrative 
numerical results will be presented. 


4. ANEW APPROACH TO RADIONUCLIDE TRANSPORT 


A major shortcoming of the classical advection-dispersive equation is its failure 
to explain scale-dependence as explained above. This suggests a fundamental flaw 
in the physical arguments employed to justify its use and encourages us to look 
more closely at the underlying assumptions. To aid in our quest we consider 
more carefully the geometrical construction of fractured rock. In general, it 
may be regarded as a medium in which there are sets of intersecting planes (the 
fractures). The intersecting paths constitute a connected network along which 
water and its associated solute may travel. Thus if we consider a marked particle 
in the fluid, motion will consist of a series of, more or less, rectilinear flights 
followed by a sudden change in direction as the particle meets an intersection and 
is diverted into a new direction. To draw an analogy with kinetic theory, or 
more precisely with neutron transport, the rectilinear sections constitute a mean 
free path and the intersections a scattering collision. Figure 4 illustrates the point 
with A and B as intersections of fracture planes. The ‘mean free path’ is A-B and 
an intersection is a “scattering centre’. Such events are occurring throughout the 
fractured medium and statistically can be quantified by a Boltzmann-like 
transport equation. 
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Figure 4: Intersection of fracture planes 


4.1 Derivation of Balance Equations 





To incorporate these ideas into a balance equation, we must introduce the concept 
of an ‘angular concentration’ C(r, v, Q, t). Thus C(r, v, Q, t)drdvdQ is the 
concentration of marked particles (or solute) associated with the transporting 
fluid, in the volume element dr, at r, travelling with a speed between v and v + 
dv in a fracture with direction Q in the solid angle dQ at time 4. In addition, we 
must define the concentration adsorbed into the rock surface, which bounds the 
fracture, as F(r, Q, t). We note that the velocity does not arise because the solid 


is stationary. 
J, j 


v 


Figure 5:Angular co-ordinates for the fracture-flow system 
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Figure 5 illustrates the concept more clearly, where we have denoted by the unit 
vector k a reference direction which could be for example the net flow direction 
of the water. Thus the ‘direction’ of the fracture, denoted by the unit vector &, 
makes an angle cos! (Q. k) with the reference direction. It is also important to 
note that @ is fixed by the rock structure and so has a different meaning from the 
scattered direction which arises in neutron scattering. In the latter, the scattered 
direction or direction of motion of the neutron, is infinitely variable and will 
depend on Q.Q' where Q" is the direction before scattering. In the case of 
motion through rock, the directions of motion are fixed but, of course, they still 
have a statistical distribution f(€2.k) because fractures are not all lined up in the 
same direction. Thus f(Q.k} is a measure of the spread in the fracture 
orientation with respect to some prescribed direction k. 


Let us now construct balance equations for C and F in a medium with porosity €. 
The equation for C will be 


q2 + yQ,V + vE(r, Q) + a leenan 


= ef dQ’ | dv'v' Etr, glr; > O,v’ > v)Clr, v, X,t) ~ f (CF) + ES, v, t) (15) 
and that for F 

ə 
a-e 3 +2 Fene £1C.F) (16) 


The structure and derivation of eqn (15) follows that of the neutron transport 
equation (20, 21) and arises by considering a balance in phase space dr dy dQ. 
The meaning of the various parameters in the equations is as follows: 


X(r, Q): This quantity is better understood as 1/2 which is the mean free path 
or, in this application, the mean distance between fracture intersections. Clearly 
it does not depend on water velocity v but may depend on position r in a 
heterogeneous medium and could also depend on the fracture orientation. For 
example, oblique fractures may have shorter distances between intersections than 
more obtuse ones. £ is obtained from measurements on the rock. 


g(r;Q’ 9 Qv =v): Physically, gf...) dv dQ is the probability that, on reaching 
an intersection point, a marked particle in the water with speed y” and travelling 
in the fracture whose orientation is Q', will be diverted into the new fracture 
whose orientation is in the solid angle dQ about Q and achieve the new speed v in 
the range (v, v + dv). Clearly, 
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favj dage + Qv ~ v)=1 (17) 


The form of g will depend on the rock geometry and the fluid velocity and it will 
be discussed it in more detail below. 


The term S in eqn (15) is an independent source term and A is the radioactive 


decay constant of the nuclide. The extension to a chain of nuclides will be 
discussed. Finally, we note that fs (C, F) is the adsorption term and denotes the 


rate of loss of dissolved contaminant to the rock surface. f, is the so-called 


adsorption isotherm and various analytical forms are available (2). The simplest 
such form is the linear isotherm, which is 


f,==(BC-F) (18) 


where T is a time constant and B = K4 Ps, Ps being the rock density and Kg the 
distribution coefficient (19). 


Eqns (15) and (16) are coupled equations for C and F and are reminiscent of the 


way in which delayed neutrons are represented. If we assume that adsorption and 
desorption are in equilibrium, we may set 


F=K,p,C (19) 
Now adding eqns (15) and (16) and using (19), we find 

E + ale +(1-€)K,p,)C + e[vQ.V + vE]C 

= ef dQ’ [avv Er, gew > Av’ 3 v)C(r,v’,.O’,t) + ES(r, v, Q,t) (20) 
Dividing by € and defining the retardation factor R by 

R=1+(1-e)K,p,/€ (21) 
we can write 

E + a|r C+[vQ.V +vE]C 

= fax | avv Er, Mge; > Q, v 3 v)C(r, v, Q’,t) + S(r, v, Q,t) (22) 


Eqn (22) is a conventional neutron transport equation and is therefore amenable 
to many well-developed solution methods, both numerical and analytical. 
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Naturally, the space and time scales are very different from those encountered in 
neutron transport theory. For example, fluid velocities v are of the order of 
metres per year and values of 1/2 may be up to a kilometre in sparsely fractured 
rock, although they may also be of the order of centimetres in some finer- 
grained structures. 


Eqn (22) may be extended to describe a radioactive decay chain and may then be 
written 


FE + ajre +[vQ.V +vE]C, 


= fa f avv Er, Yg; o- Q, v’ -> v)C, (r, v’, Qt) + r MEN AE Sa TE y S (r,v,Q,t) (23) 


It should be noted that the velocity v = v(€) because the fluid velocity depends on 
the fracture width as does the porosity €. 


4.2 Generalisation to Overlapping Fracture Sets 


The derivation above is confined to a rock in which there is a single set of 
fractures with the same flow and orientation properties. In practice, real rock 
formations contain sets of overlapping fractures with widely differing scales. 
Flow can occur through the legs of a given fracture set at a speed characteristic 
of the fracture. There will also be routes by which flow can occur from one 
fracture set to another. Thus, while several scales of flow may exist, 
characterised by a fracture set, there will be communication between the sets and 
hence a mixing of radionuclides. As an example of these differing scales, we 
may have a rock medium in which there are large scale fractures with mean 
distance between intersections of several hundred metres. However, the rock 
surrounding these fractures will have its own microfissures of perhaps several 
centimetres through which water is also flowing. The velocities involved and the 
orientation functions will be very different but both can be modelled 
simultaneously by an extension of the above theory as we now show. 


Let us assume that the medium contains N distinct fracture sets and let C ATs V, 
Q, t) be the angular concentration in fracture set k. We may now write, by a 


generalisation of the arguments used in Section 4.1, the following balance 
equations 


[r z +vQ.V + vz, (Q)+ AR [e.no 


Me 


J dO’ f dvv Eg Y >Q, v > v)C, (r, v, Qt) + S(r, v, 2,2) 


i 


J 


(24) 
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where k = /, 2,...N 


This equation is for a single radionuclide and Ry, is the retardation factor 
associated with fracture set k. X}; is the intersection probability per unit length 
between fracture sets j and k. More physically, 1/2,; is the average distance 
between an intersection in fractures of type j and one in type k. Clearly 


AEA (25) 


l=] 


The quantity g,; (...Jis the probability that a ‘marked particle’ with speed y’ in 
fracture set j oriented in direction Q" will be deflected into speed v in fracture set 
k oriented in direction Q. 


It is important to note that in terms of C, the actual measured concentration of 
radionuclide (e.g. Bq m~3) is defined as 


C(r,t)= $ Í dQ Í dvC, (r, v, 9t) (26) 


k=l 


The net current of radionuclide (Bq m~? s~!) is 
N 

Jer) = F, f dQQ] dw0,(r,v,0,0) (27) 
kal 


These equations are analogous to those of multigroup neutron transport theory 
(22). 


5. THE NATURE AND MEANING OF THE SCATTERING TERMS 


Before embarking on the practical applications of the above equations we digress 
to a discussion of the basic quantities, = and g and define them in terms of 
measurable rock parameters. 


We consider first the calculation of £ where 1/Z is the mean distance between 
intersections. This problem is related to percolation theory and has been 
reviewed and developed by Robinson (23) and by Sahimi (40). For a network to 
be percolating it is necessary that the fracture density (i.e. number of fractures 
per unit area) be above a so-called critical density. The critical density is defined 
as that density above which infinite clusters of fractures appear or in other words 
when one part of the network can communicate with any other part no matter 
how far the two points are apart. If a network is below the critical density, its 
effective hydraulic conductivity is zero. We shall always assume that the critical 
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density is exceeded but methods are available for calculating it in 2 and 3 
dimensional structures based on Monte Carlo sampling techniques (23). 


5.1 The Number of Intersections per Fracture 





For the purpose of our theory we need to calculate the number of intersections 
per fracture. To illustrate this, we give an example. Consider a system with 
fractures of a fixed length £ oriented either horizontally or vertically with equal 
probability. The fracture density p is such that in a region of area A there are N 
= pA fractures. Any randomly chosen pair of fractures will intersect if they are 
orthogonal and if the centre of one lies within a square of side £ around the 
other's centre. Ignoring the effect of boundaries, the probability that the two 
fractures intersect will be 
i 
P=7A (28) 


Since the process is a binomial one, the probability that a given line intersects r 
other lines is 


P, = T(N -IXN-2).(N -- p)" pa- py" (29) 


As we allow A — œ, this reduces to 


OEE o 


Thus the distribution of the number of intersections is a Poisson distribution and 
has a mean value 


n= (31) 


A more general expression for Np, can be obtained in terms of a fracture 
orientation function g(@). We shall derive such a function for the extended case 
when the fractures have different lengths and orientation distributions. Let 4, 4, 
be the line lengths and g,(@)and g,(6) the orientation distributions. Also let p,be 
the linear fracture frequency (i.e. the number of fractures that intersect a line of 
unit length when all fractures are horizontal). Now choose two lines, one of 
length 4, at orientation @; and the other of length £, at orientation 82. Then the 
number of segments intersecting £, over the angular range d@ will be 


Pa Pa, tila Sin! 8, ~ 0,140, (32) 
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The average number of segments intersecting £, over the range @ will be 
s x/2 

PaPa, t | dt,t,F;(4,) | dO, sin! 8, -0,1 8,(0,) (33) 
0 =R]2 


where F,(£) is the probability density function for the line length. 


If £, is also distributed at random according to F,(4,) and g,(@;), we have for the 
average number of intersections per fracture (or the connectivity) 


n/2 xj2 
M2 = PnP bil, [d0, [d0,sinl0, -8,1 8,(0,)8,(0,) (34) 
=R/2 -=xr]2 
If the fractures are identical, N,, reduces to 
_. "i2 zj 
N= pi È d0, |do, sin, - 6,1 g (8,)g (6,) (35) 


-Ki2 -xr]2 


where the areal fracture density p = p}, and £ is the mean fracture length. It is 
readily shown that if we set 


1 1 n 
g(8) = 5 6(8) +5 5(6- >) (36) 


ie. orthogonal lines, we regain eqn (31). 


Now the mean distance between intersections will be 


(37) 


2\~ 
tim 


In the case of different line types, we must distinguish between Z,, and £,,. To 
do this consider Figures 6a and 6b below. In Fig 6a, the line AB is the distance 
between a 1-1 intersection and a 1-2 intersection. In Fig 6b, the line CD is the 
distance between a 2-2 intersection and a 2-1 intersection. The average of such 
distances over the network in Fig 6a is 1/Z,, and the average in Fig 6b is 1/Z,,. 
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Figure 6a 





Figure 6b 


Now £,, =" and È, =% where we note that Nj. = N3;. In general then 


2 1 
Sas 5 (38) 


There are some forms of g(@) that are particularly useful in practice. For 
example, suppose that 


2(@) = A ; -as@<a 
2a 
=0 ; outside this range (39) 


Physically, this distribution indicates that the fractures are uniformly distributed 
within a given range (~a, a). 


If we use eqn (39) ineqn (34), we find 


A i oo S 
Ni =P Pr ral dO, | dO, sinl, ~ 6, (40) 


-0 
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where we have set 
PaPe, = NPP (41) 
pı and p, being the areal densities of fracture sets 1 and 2, respectively. 


Evaluating the integrals in eqn (40) leads to 


5 œ — sin œ cosa 
is 1 i 
Na = V¥P,P244 2 iA SA 


as (42 
= a-Si ) 
= Jap, 4l, ae cea, 
a, a, 
For a = Q2 = Q, $, = £, = £ and p)= p: = p, eqn (42) becomes 
Napi ee eels (43) 


2a 
which agrees with Robinson's result (23). 


Thus we now have a well-defined procedure for calculating Ł; We also note 
that È is not a function of Q unless the rock fracture angle distribution function 
g(Q) varies spatially. We shall assume statistical homogeneity throughout. 


5.2 The Angular Scattering Distribution 


A more difficult problem is posed by the function g,{...). It is clearly dependent 
on the way in which water flows through the fracture and on the angle through 
which it is diverted at an intersection. For the sake of simplicity we shall make 
the following assumption, viz: 


By (X > Yv > v) = (v - v.) fy) (44) 


This assumes that as soon as a fluid element from fracture set j enters fracture set 
k, it acquires the velocity characteristic of fracture set k, i.e. vy. It also assumes 
that the direction travelled in the rock after the deflection at an intersection 
depends on the orientation of the new fracture and not on the previous one. This 
is clearly exact since the fracture angle in the rock is fixed. 


If we now define some reference direction, say by the unit vector n, the 
orientation of the fractures can be described by 
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f(Q. n)= fy) (45a) 

where u=cos8. 

But we know from the work in Section 5.1 that iil H) is related to g,(@) by 

fy (dye = 8,,(0)d0 (45b) 

We have assumed that the direction distribution into which the marked particle is 
deflected in the new fracture, k, does not depend on the nature of the fracture 
that the particle is leaving. The scattering term g is therefore defined in terms of 


known rock and flow properties. 


5.3 The Flow Velocity in a Fracture 





It is necessary to calculate the value of v, in eqn (44), i.e. the water velocity 
along the fracture set k. Let p, be the number of fractures per unit area of rock 
face perpendicular to the net flow direction, then if A, is the area of the aperture 
of a single fracture, the total area available for flow through the fracture set k is 
p, Ay per unit area of rock. The volumetric flowrate through fracture set k will 
be v; P A, and the total volumetric flowrate 


Q, = EPA (46) 
k 
If Z is the mean flow velocity through the network, we have, by definition 


TPA = Ler, (47) 


Now assume that flow through the fractures is laminar, and that a fracture has 
width t,. The velocity through it is then given by (19) 





=- IVBI 
vy = Ju 124, i? (48) 
where 


wid 
f= [d0cosbg,(8) (49) 


-2/2 
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is a directional factor which affects the velocity in fractures that are at varying 
angles to the net flow. fy is the viscosity of the fluid and VA, is the driving 
pressure gradient. 


Using eqn (48) in eqn (47), we find 


_ IVR] LPs A fut _ IVA | LPabutt t 


50 
2y, Ioa T24, Son i 


Now if we know @ from measurements, or can specify # from other conditions, 
we can calculate 1VA,I/12u, from eqn (50) and hence obtain v, from eqn (48) in 
the form 


Let; 


v, = fee (51) 
DiS igt} 
2 KHI 


6. THE SIMPLIFIED EQUATIONS FOR OVERLAPPING FRACTURES 


We have seen in Section 4.2 how we may obtain a set of equations for the angular 
concentration in each fracture set. We have also seen in Section 5 how the 
scattering term may be reduced to a relatively simple expression in terms of 
obtainable rock properties. Let us now combine these results to obtain a set of 
working equations. Thus we set 


C, (r, v, 9,1) = 5(v — v, )ġ, (r, Q,t) (52) 
S,(r,v,Q,1) = lv - v,)Q, (r,Q,1) (53) 
and obtain 


N 
[RS+yaV +v,2, HAR aeoo = Vv 2yfj(Q) Í dQ (r, Q,t)+ Q, (r, 9,1) (54) 
jsi 


where k=1,2,...N 


It is also evident that Q,(r,Q,t) = Q, (r,t) fa (Q) because the source particles can only 
travel in the fracture directions. 


These equations are coupled transport equations analogous to the conventional 
multi-group neutron transport equations (22). They are therefore amenable to 
all of the solution methods developed for neutron transport calculations. Indeed, 
it is possible that existing codes can be modified with little effort to solve eqn 
(54). 
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We shall return to these equations below when we consider practical examples. 
At this point, however, it is worth considering the physical meaning of the above 
equations. Let us do this for a single fracture set, the equation for which may be 
written 


[ros WAV +v +R loroo = V(O f dQ’O(r, X,t) + (r,t) AA (55) 


This equation describes the motion of a marked fluid particle in a network of 
inter-connected fractures. Although the particle moves in a straight line within a 
fracture, its overall motion is random (subject to prescribed probability laws). 
The occasional changes in direction lead to a dispersive effect, i.e there is an 
angular mixing which causes some particles to move further in the flow direction 
than others at a given time. This dispersive motion arises quite naturally and 
does not require any ad hoc diffusive mechanism to cause it. We shall later see 
that a pseudo-diffusion equation can indeed be derived from eqn (55) but one that 
is entirely self-consistent. In that respect, it is useful to consider an alternative 
representation which also leads to dispersion. We do this by returning to eqn 
(24) and considering it for a single fracture set but writing 


U(X > Qv > v) = Giv 3 vô- Q’) 
Cir, v, 9,1) = f(r,v,)6(Q-Q,) 
Sir, v, 2,1) = O(r,v,)d(Q-Q,) 


Eqn (24) then becomes 


(z + ye + zjav = xf dv'v'G(v 3 v)f(4,v',.0+ 0,0 (56) 
ð Ox 


where we have set Q,.V=0/ 0x, x being the direction of flow. 


Eqn (56) is an integro-differential equation in velocity, space and time analogous 
to eqn (55) which is in direction, space and time. Just as angular variations lead 
to dispersion, so do velocity variations, thus similar dispersion effects, due to 
velocity differentials, arise. If we approximate the scattering kernel G as 


G(v > v) = F(v) (57) 


a not unreasonable assumption physically, eqn (56) may be solved analytically 
and the dispersive character illustrated directly (24). For fractured media, we 
would not recommend the use of eqn (56) since it eliminates the fracture 
properties, nevertheless it has been used by Impey and Grindrod (25) to validate 
some Monte Carlo simulations for mass transport in stochastically generated 
fractures. 
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This general discussion sheds some light on the failure of the classical advection- 
dispersion equation which is clearly limited in the way that it can represent the 
true properties of the medium. 


7. A PHYSICAL INTERPRETATION OF THE NEW EQUATION 


By considering the special case of N = 2, we may obtain an instructive physical 
interpretation of the new model. Consider the two coupled equations obtained by 
setting N = 2 in eqn (54). We find 


[r z +y. V +v (2a + Za) +R, faao 


= v Efa (Q) | dQ’, (02/1) +v Enf Df dgr, Y, +O, lt) (58) 


[r 2 +V. V + v,(Ly +È) taR Jo,00.2.0 


= v En fa (Qf dV Enp X,+ Enfal Vf dX Enp (1,2’,1) + Or, 1) (59) 


The meaning of the gradient and the radioactive decay terms are self-evident, but 
the scattering term deserves some explanation. Let us consider eqn (58). The 
first term of the right hand side is the number of marked particles scattered from 
one l-type fracture to another l-type fracture at an intersection. The second 
term on the right hand side is the number scattered into a 1-type fracture from a 
2-type fracture. These are therefore effective source terms. The terms on the 
left hand side of the equation involving Z,, and Ł,, define the rate at which 
particles are removed from a l-type fracture to another 1-type fracture (Z,,) and 
the rate at which particles are removed from a 1-type fracture to a 2-type 
fracture (Z,,), Eqn (59) has an analogous meaning for particles in a 2-type 
fracture. It is readily shown that the equations obey overall particle balance. 


An interesting feature of these equations arises if we set v, = 0 and Q, = 0. 
These are fractures in which no flow occurs, i.e. dead end pores. Then the 
equations become 


[r24 vV +v (Ent En) +AR [øE 


= VE, fa Df dV Alr, X,t) +O r,r) (60) 


R, [2 + a ran =v Eyfa (Of dG, X,t) (61) 
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The concentration in dead-end pores then follows closely the distribution of @,. 
Integrating eqn (61) over Q, we find 


R, F $ a Jott. =v En a(r,t) 62) 
which leads to 


Gaa(tst) = MEn RS de> olr) (63) 
which shows the in-situ decay of radionuclides in dead-end pores and their 
retarding effect on transport. 


8. THE EIGENVALUE STRUCTURE OF THE NEW EQUATIONS 


Consider the one-dimensional, steady state form of eqn (55), viz: 
ð 1 
[uZ ve2 fove.u)= vyu faroa) (64) 
-1 


where v denotes v/R. 


This is an unusual form of transport equation because the scattering function f is 
a function of u where u is the angle that Q makes with the z-axis, i.e. 4 = Q.z 
Normally, scattering functions are functions of Q.Q', i.e. scattering angle before 
and after collision. In the present problem, however, Q and Q' are defined by 
the rock structure and are not free. An analogous situation has been noted by 
Ganapol (26) in his studies of radiative transfer through leaf canopies. 


For fundamental reasons, and also because it guides us in developing approximate 
solutions, it is important to understand the nature of the eigenvalue spectrum 
associated with eqn (64). Details of the procedure may be found in Cassell & 
Williams (24) but we briefly review the results here. 


Consider an elementary solution of the homogeneous form of eqn (64) in the 


form exp(a@z). Such solutions will exist if roots œ can be found to the following 
transcendental equation, where f(y) # f(-2), 


1 
l= afu (65) 
AL+—+ Oy 
Vv 


If we set @=2/(2+A/v), and replace @/(L£+A/v) by @ eqn (65) reduces to 
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1 

isa jar (66) 
41+ gH 


in which we note that 0<w <1, and 


faww=1 (67) 


The investigations of Cassell and Williams (27) show that two roots to eqn (66) 
exist, viz: Of, and -Q, (1), Qt > 0). Thus two elementary solutions exist of the 
form exp(@, z) and exp(-a, z). This, in turn, suggests a diffusion-like behaviour 
of the asymptotic part of the solution, asy, and we expect Pasy to satisfy an 
equation of the form 


(E-aldenpati-o G 


This may be further written 


À 1 1 
-m AP om, Sy on =0 69 
0,0, Pay {2 a, | co ary (69) 


In a more familiar notation 
Doiy -Upay ~ Apay =0 (70) 
The first term in eqn (70) is due to diffusion or dispersion, the second to 


advection and the third to absorption or decay. In conventional neutron 
diffusion, Œ; = Œ; and U, the advective velocity, is zero. In the present, more 


general, case the asymmetry in f(u) about u = 0 leads to net flow in the positive 
or negative direction, according to the magnitude of 


fafan (71) 


Knowledge of this behaviour is of considerable value in constructing approximate 
solutions as we shall see below. 


9. AN ASYMPTOTIC TIME-DEPENDENT TRANSPORT EQUATION 


While the transport equations in Section 6 can be solved directly by numerical 
means, it is useful to derive a simpler equation which is amenable to analysis. 
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This is the purpose of the present section and it relies on the properties of the 
eigenvalues deduced in Section 8. 


We shall consider eqn (55) in the following form 


re] v v v t: ag a 1 
E H toT Apcem= pw] au C(x, k’) +2 OOF) (72) 


For notational simplicity we set v/R = v. 


It will be recalled that the spatial eigenvalues œ are deduced from 
1 

[voy + vE + AJF) = Vf (u) | PU dy’ (73) 
-1 


Thus there are two asymptotic solutions of the form 


Z+—+au 
v 


F(u)= 


where &= -Op Qf). 


Following experience gained in neutron transport theory (20, 21, 29), we now 
seek a solution to eqn (72) in the form 


O(X, st) = Fy(U)Ag (x,t) + F (uM) By(x,0) (75) 
Thus from eqn (65) we find that the concentration is given by 

(x,t) = Ay (x,t) + By(x,0) (76) 
Inserting eqn (75) into eqn (72) and integrating over u(-1, 1) we find 

[D, +A +vF,D,]Ay +[D, +4 + vF,D,]B, =Q (77) 
Now multiply eqn (75) by u and integrate over u(-1, 1) to get 

[F D, + AFy + VEFy, + VFyD, —vEfi |A +[F,D, + AR, + VER, + vR D, -vEf]B,=Of, (18) 


where 


1 z 
F;= | duu'F(u) (19) 
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= f dauf (u) (80) 
D,=0/at and D,=d/ax. 


Eliminating Ag and Bọ in turn, we find the following equation for the 
concentration $p, 











1 v£ +24 v E 
D? + D,+ Fn-Fa p 
Ee 'vEta ' vE+A Fy -F, Dale 


-|Z Ef- -|f 4 FaFa |p aly, 
w+ K-A, 7 \vE4+aA vE+a R-F, 


ada „fanz 81 
Ae- El- Fi =fa bp, “p ia 


We may now define some familiar variables as 








pa? FFn -Foha (82a) 
E+A/v Fy -F, 
VER A Fy-Fo (82b) 
i + aa a à R- FF, 
(82c) 
” Se ae v = a 
-2+2 (82d) 
vVE+A 
1 
c—— 82 
VI+A ee) 
=P; 
=v paai (82f) 
vias Fy — Fy ) 
Then eqn (81) becomes 
g’ CH Pe y2 
eae roo [o ae = U alaro (83) 
where 
ð sø 
Q= [{ž- 7 2) io (84) 


Evaluating the integrals F;;, we may simplify eqns (82a-f) to 
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A, 1 











ms Ge, = men 
U=- a lit} (85b) 
ora'i) as 
v -firit siei) (85d) 
where 4,=A/vd and @=a/(L+A/v)=a/Z/(1+A,). 
In the limit of zero decay (A= 0 ), these expressions reduce to 
D= E (86a) 
U= fv (86b) 
1=-1/%Z (86c) 
role ao 


The structure of eqn (83) is of considerable interest and we call it a generalised 
Telegrapher's equation. In a more restricted form such an equation was 
proposed in an ad hoc fashion by Scheidegger (28) but no fundamental 
explanation was given. In the present work, the equation is a natural outcome of 
a consistent approximation to the transport equation. Physically, eqn (83) is of 
the advective-dispersive type with the added effect of a wave-front arising from 
the second-order differential in time. The cross term in x and t shows that a 
disturbance will propagate with different speeds in the +x and -x directions. 
These are all features that we would expect from transport in the fractured rock. 
The added advantage of the Telegrapher's equation is that all of the parameters 
are uniquely defined in terms of rock properties. 


It is instructive to write eqn (83) in the following form 


O22 2 le ok Oe?) pn A 
(2+ zyz á 2), +2(2442 hp, =o 7% (87) 


This construction shows the presence of characteristic velocities vt, y` and u/K. 
The ways in which they enter the calculation will become evident later. 
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10. SOLUTION OF THE GENERALISED TELEGRAPHER'S EQUATION 
The generalised Telegrapher's equation, eqn (83), may be solved analytically for 
an arbitrary source term Q(x, t). The explicit details can be found in Williams 

(30). Here we outline the procedure. 
A Green function Gfx - Xp, t) is defined by the following equation 


TË +1G’ +KG = DG” -UG' -AG+ 5(x— x, )5(t) (88) 


We also define the new function w(x,t) by 


mF ae A 
nand? V 2) +1} yex0 (89) 
Then 
YORE) = Fy [7 dG- xot = fey sto) (90) 


hence o from eqn (89). 


The calculation of G is straightforward, if tedious, and the result for a plane 
source where 


Q(x,1) = 0 6(x)S() (91) 
is as follows. 


If we define #=xX, f=vLt, D=vD,/ £, l=h/ E, U=vU, ‚t= t/v}, vt = vš, 
B=v=2B,, V" =W, c=vEc, then after some algebra,we find: 


For 0< X¥svot 


afi + eh — J yexp(—c,3 / v3) 

= exp((U, — col) /2D,) Vo Vo 

aana a (92) 
0 o *o. $ J digs -iE f Eio) 


Ely; 


For -vwisx%<0, we find 
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rf - a + Žjexp(což 1v) 
exp((Ua — Colo)ž / 2D) Vo Yo 


vx = So (+ 4D, T)? 


h (93) 
+ faist- ipe fai) 


-i/y, 


where 


fixt)= i - ef + Case) patne] 


a Oe cr ed Cd poea o9 





with 
1/2 
ole- ea) 
Vo Vo (95) 
c= 2D +UI 
P+4Dt (96) 
whl fe, Ya e 
p=ta(S+ Ss) apt ay (97) 
P ‘ 1/2 
r= ex 5) (98) 
1/2 
vi È? + 2) pE 
4r? T 2T (99) 


where /g(x)and /;(x)are modified Bessel functions. 


It is clear that a plane source will spread and propagate in both +x directions 
with the leading edges having speeds v* as explained earlier. In addition, the 
pulse itself will move with an advective velocity U. As the pulse moves so it will 
acquire a diffusive character with the source shape gradually being ‘forgotten’. 
The corresponding classical advective-dispersion solution can be obtained by 
allowing t, 30. Then 


| a aii oti 
C(x, t) = ——2_, | Tesli- i)e > exp -=e 
lá (x ) (47D)? fg“ oe epl 4Dyiy (100) 


We shall compare numerical solutions of these equations in the next section. 
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11. NUMERICAL RESULTS 


11.1 Comparison of Telegraphic and Advective-Dispersion Solutions 





In order to illustrate the new model, we will consider a number of situations and 
rock properties. Our first aim is to compare the solution of the generalised 
Telegrapher's equation with that of the classical advective-dispersion equation. 
We do this by taking a source in the form of a simple square wave in time which 
has a duration of one unit of time, ie 1/v£ = 1. We also remind the reader that 
distance is in units of 1/Z. The fracture angle distribution is taken to follow a 
picket fence model with f(4) defined in the following way. 


f(w=10 ; -0.9<p<-0.85 
= 19.0; 0.85< u <0.9 (101) 
F(t) is zero elsewhere in u(-1, 1). Using Ao = 10-4, we calculate the roots œg and 


a, of eqn (65) and thence find 2, = -2.386 x 10°?, Ug = 0.7874, Dy = 0.7875 and 
fı = 0.7875. 


A series of calculations is now presented graphically to illustrate the limitations 
of the advective-dispersion equation. Fig 7 shows the concentration from the 


Telegrapher's equation as a function of distance at times T = 2, 5, 10. Fig 8 
shows the same information using advective-dispersion theory. 
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Figure 7 Concentration (Telegrapher) vs distance 
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Figure 8 Concentration (Diffusion) vs distance 





o 50 100 150 200 
Figure 9 Concentration (Telegrapher) vs distance 


The marked wave front of the Telegraphic solution should be noted and the way 
in which, at early times, it preserves the source shape. A shortcoming of the 
advective-dispersive solution is its instantaneous response leading to no wave- 
front and therefore a marked tendency to over-estimate the concentration at 
distances from the peak and to under-estimate the peak value. 
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Figure 10 Concentration (Diffusion) vs distance 


To show how the comparison between the two methods changes as time increases, 
Figs 9 and 10 show the concentration for T = 20, 50, 100, 200 as a function of 
distance, Fig 11 shows a more detailed comparison of Telegrapher and advective- 
dispersive solutions at T = 100. 
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Figure 11 Concentration vs distance at T = 100 


We note that the Telegrapher's equation shows some structure of the source up to 
T = 50 but beyond that the resolution of the plotter is unable to reproduce it. We 
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also note that the detailed shape of the source is no longer present but that the 
general features of the discrepancies between Telegrapher and advective- 
dispersive solutions still remain, viz: overestimates at large and small distances 
and underestimates of the peak value. 





o 5 10 15 20 25 
Figure 12 Concentration (Telegrapher) vs time 
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Figure 13 Concentration (Diffusion) vs time 


An alternative way to illustrate the concentration is by the conventional break- 
through curve where concentration is plotted versus time at given positions. Figs 
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12 and 13 show the concentration at x = 2, 5, 10 and Fig 14 compares break- 
through curves for Telegrapher and advective-dispersive solutions at x = 5. 
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Figure 14 Concentration vs time at x = 5 





0 50 100 150 200 250 300 
Figure 15 Concentration (Telegrapher) vs time 


The most obvious feature of these curves is the position of the break-through 
point. For the Telegrapher's equation, this is an abrupt point corresponding to 
the wave-front, whereas the advective-dispersive equation predicts an 
instantaneous response with an over-estimate of the concentration at early times 
and an under-estimate of the peak value. 
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o 50 100 150 200 250 300 350 


Figure 16 Concentration (diffusion) vs time 


At distances further from the source, viz: x = 20, 50, 100, 200, Figs 15 and 16 
show the break-through curves, with Fig 17 giving more details at x = 100. 
Again we see the expected behaviour although there is, as distance from the 
source increases, a tendency for the advective-dispersive and the Telegrapher's 
solutions to become closer. 
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Figure 17 Concentration vs time at x = 100 
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11.2 An Explanation of Scale Dependence 





One of the major shortcomings of conventional advection-dispersion theory is 
that it appears to be scale-dependent (3/). That is, in order to obtain agreement 
between experiment and theory, it is necessary to increase the dispersion 
coefficient as the distance over which measurements are taken increases. If 
advective-dispersion theory were truly representative of radionuclide transport 
through fractured rock, the dispersion coefficient would be a fundamental 
parameter and independent of the system size or boundary conditions. In order 
to obtain further insight into this phenomenon, we have attempted to fit the 
advective- dispersion solution to the Telegraphic one by adjusting the dispersion 
coefficient D, until the peak value of the advective-dispersive solution is equal to 
that of the Telegraphic one. We have done this for concentration versus distance 
at fixed times and for concentration versus time for fixed positions. As a result, 
an effective dispersion coefficient can be obtained as a function of time or 
position. 
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Figure 18 Effective dispersion coefficient 


We have observed that even with matched peaks, the advective-dispersion 
solution tends to overestimate the concentration at large distances and 
underestimate it at small distances. However, as the time increases, the modified 
advective-dispersion solution becomes close to the Telegraphic one. Fig 18 
shows how the effective dispersion coefficient increases with time and tends 
asymptotically to value of 0.185 D,. Similarly, when the fitting is done against 
time for fixed distance, Fig 19 shows the behaviour for a range of x-values up to 
1000. The same monotonically increasing behaviour is noted. 
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Figure 19 Effective dispersion coefficient 


A rough fit can be obtained to the behaviour of D,/D,as follows 


Dy _ eed A 

D, = 0.189) exa( 5) (102) 
and 

Dy _ i 

= = 0.1891 -en(-545)| (103) 


These expressions are approximately consistent with the relation x = U, t, with Ug 
= 0.83. Thus the spatial relaxation length for the dispersion coefficient to reach 
its asymptotic value is 77.5 and the corresponding relaxation time is 93.5. We 
remind the reader that distance is measured in units of average fracture length 
and time in units of average fracture transit time. Thus, if the average fracture 
length is 1 m, the relaxation length defined above is 77.5 m. If the effective 


velocity of the radionuclide is 0.1 m y~}, then the relaxation time is 935 years. 
The general trend of these numerical results is consistent with that seen 
experimentally and deduced by other theoretical workers in this field. It may be 
concluded that the Telegraphic solution is more in accordance with experimental 
data than the advective-dispersive one. However, this conclusion is only tentative 
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until further calculations can be carried out using more accurate solutions of the 
transport equation. 


11.3 The Influence of the Fracture Orientation Function on the Concentration 





In this section we consider the form of f(u) derived in Section 5 for a fracture 
orientation function g(@) that is uniform between (-@, œ), i.e. eqn (39). The 
value of È is obtained from eqns (38) and (43) and we find 


=2a-—sin2a 


The form of f() is deduced from eqn (45b). Then if 


i= >; -asé<sa 
2a 


=0 ; elsewhere in (-1,1) (105) 


we readily see that 


1 1 
(u)=— ; cosaspsl 
fu er m 


=0 ; -lsp<cosa@ (106) 


If eqn (106) is used in eqn (66), we find that the eigenvalues @, and -@, are roots 
of 


avi-@? (= (2) 
tan ———— |= | — tan| — l 

( 20 ive) FAZ di 
Note that œ is maximum rock orientation whereas @ is the desired root. The 
actual values must be obtained numerically and hence also the values of Dg, Ug 
etc. 


Since we wish to illustrate our results in real time and space variables, we note 
that 


2a 
= * 07 Qa—sin2a) (108) 
z'] 2a? 
«tpt, (2a —sin2a) (109) 


We study the evaluation of a pulse of radionuclide of 0.306 years duration as it 
moves out to 500 metres over a period of some 10 years. Table 2 gives values of 
the main parameters for a range of œ from zero to 7/2. 


RADIONUCLIDE TRANSPORT IN FRACTURED ROCK 183 


TABLE 2 





We shall display results at various times and distances. The following data is 
used: p = 6 x10°3 m? {p= 50 m and v = 100 my": 
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Figure 20 
Concentration at 6 years for a range of fracture angles 
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Figure 21 
Concentration at 500 metres for a range of fracture angles 


Fig 20 shows the pulse at six years as a function of distance, for a range of values 
of & from 0.2 to 4/2. The concentration at @, t is denoted C(a@, t). The results 
are in full accord with our physical intuition. For small values of a@, the 
fractures are closely parallel to the direction of average flow and the motion is 
almost entirely advective with little angular mixing. As œŒ increases, we note 
increasing dispersion and smoothing of the source shape. For @ = 7/2, i.e. more 
or less randomly distributed fractures over the whole angular domain, the 
concentration adopts the characteristic dispersive shape but with a well-defined 
wave front. However, we hasten to add, that the shape is by no means Gaussian 
and is therefore not predictable by the conventional advective-dispersive model. 


Fig 21 shows the corresponding break-through curves at 500 metres for the same 
range of Œ values. The concentration at (a,x) is denoted C(a,x). For small a, 
the source propagates virtually unaffected by the medium with a well-defined 
wave-front. As before, as Œ increases, a dispersive effect develops due to the 
mixing between different orientations and an overlapping of the wave fronts. 


This is a simple case, but it illustrates that the model contains the essential physics 
of transport through fractured media and that the properties of the fractures are 
realistically reflected in the solution. More complex situations, with overlapping 
fracture sets, can only be dealt with using numerical methods. 
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11.4 Numerical Methods of Solving the Transport Equation 





The analytical solutions described above are of considerable value for gaining 
insight into the general structure of the solution and for ascertaining the effect of 
the various rock parameters. In practice, however, real rock formations tend to 
be heterogeneous and over the areal extent of interest to a waste repository, it 
may be necessary to study the flow of radionuclides through a variety of rock 
strata. Also, of course, many of the radionuclides of importance to risk 
assessments are created from decay chains of the type described by eqns (23). It 
is clear, therefore, that a numerical approach to the solution of the equation is 
necessary. There is a strong possibility, although to date this has not been 
examined in any detail, that existing time-dependent neutron transport codes 
could be adapted to solve equations of the type described above. Some 
modification to the scattering function would be required because this is now a 
function of Q.k rather than Q.Q', and also the magnitude of the parameters such 
as mean free path and particle velocity would differ. Nevertheless the basic 
structure of the equation is very similar to that of neutron transport and so it is to 
be expected that a code such as the time-dependent version of ANISN would be of 
value. 


Some advances in the numerical aspects have been made by Buckley et al (32, 33, 
34) based on a discrete version of eqn (72) in the form 


N 
(2 + Wy vba Jon) = vEa, ¥ 9, (x,1) + a,5)(x,0) (110) 


jsi 
In arriving at this equation it was assumed that 


f= F aðu- p) (111) 
k 


where [ly refers to the rock orientation distribution. In particular, a model in 
which N = 2, such that 


F(M) = a,6(u ~ p) +a blh - p) 


which is called for obvious reasons the “forward-backward’ model, is quite 
helpful because an exact solution can be obtained against which the numerical 
techniques may be compared. 


Numerically, equations of the above type can be difficult to solve because of the 
shock-like behaviour, i.e. the rapid change in shape near the wave-front. In 
order to guide the numerical analyst, a number of techniques have been 
employed, viz: (1) Upwind method (34), (2) Smolarkiewicz method (36) and (3) 
NND scheme (37). The upwind method suffers from numerical dispersion but 
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has the advantage of simplicity and ease of programming. Smolarkiewicz's 
method notes the dissipative problems of the upwind method and introduces a 
numerical diffusivity term which helps to eliminate them. Finally, Zhang and 
Zhuang (37) have developed a series of numerical schemes which were originally 
designed to accommodate the complex flow fields associated with shock waves. 
Numerical modelling of such shocks leads to spurious oscillations near or in the 
shock region. In the past, modelling attempts have been concerned with utilising 
mixed-dissipative first-order schemes containing free parameters near the shock 
point, and second-order models elsewhere, with resulting unsatisfactory 
oscillations. Thus, Zhang and Zhuang have suggested a scheme which is non- 
oscillatory, contains no free parameters and is dissipative (as opposed to 
dispersive, which implies oscillations), which they designated the NND method. 


Concentration 





-1.0 -0.5 0.0 0.5 1.0 
Figure 22 a: Forward Direction at t = 1.0 s 


Buckley et al (32) have used all three methods for the forward-backward model. 
An extensive set of calculations may be found in that paper, but we give below in 
Figs 22 a,b,c an example which shows the general behaviour. It is the general 
conclusion that the modified NND schemes are the most promising due to their 
avoidance of oscillations and their characteristically smaller dissipative error. 
While the upwind method was computationally more efficient for the simple 
backward-forward model, it is also apparent that the explicit form of the 
Smolarkiewicz and NND methods are warranted for more complicated systems 
involving several directions. 
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Figure 22 c: Forward Direction at t = 100.0 s 


A further conclusion of the numerical work, which is based on calculations using 
the fracture angle models of eqn (101) and eqn (106), is that the generalised 
Telegrapher's equation is an excellent approximation (30). 


Finally, Buckley et al (34) have studied the equations which arise from a 
radioactive decay chain. In particular, they examine how the radionuclides in the 
chain ™U =” Th» Ra move through fractured rock up to 1 km from the 
source and up to 50,000 years. The rock orientation function is that in eqns 
(104) and (106) and calculations are made for a range of values of œ. The 
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characteristic smoothing of the concentration distributions are noted as a 
increases towards 7/2. The results from the transport equation were compared 
with those from classical advective-dispersion theory. This latter is shown to be 
in significant error especially for smaller values of œŒ, i.e. when the fractures are 
nearly parallel. It is also shown that when fracture transport is dominant, a large 
number of angular segments is needed for good accuracy. 


12. CONCLUSIONS AND SUMMARY 


The representation of a spatially random medium is a complex matter and has 
been the subject of much study for more than 70 years. The method of defining 
the structure of such a medium depends crucially on the nature of the problem. 
Thus, for example, a random medium through which electromagnetic waves are 
travelling, will be modelled entirely differently from the case when a liquid is 
moving through it. This is because electromagnetic radiation passes through the 
solid and void components, albeit with different effectiveness, whereas in the case 
of a liquid, only the voids are accessible. Thus in the liquid case, we are 
concerned with the topology of the fractures or defects, whereas with 
electromagnetic radiation, the changes in the bulk properties such as density are 
more relevant. 


The tortuosity of the fracture network is regarded in the present work as a series 
of paths, along which liquid passes. Each path crosses other paths and, at the 
intersections, liquid passes with some probability from the original path to a new 
one. We have drawn the analogy with neutron transport in which a path 
corresponds to a ‘mean free path’ and an intersection to a ‘scattering collision’. 
The advantage of such a representation is that the geometry of the paths is readily 
incorporated into a balance equation by virtue of the ‘angular concentration’. 
This has no readily useable physical value but, when averaged over the fracture 
orientation, it leads to the concentration and current of the dissolved contaminant. 
There is also the added advantage of dealing with a familiar equation, namely the 
neutron transport equation. 


Calculations based on this equation, using well-known approximation methods, 
lead to results that are physically in accord with experiment. In particular, we 
can demonstrate how the fracture orientation influences the shape of the 'break- 
through’ curves and concentration profiles; with parallel fractures leading to 
very sharp advective features, and randomly orientated ones to dispersion-like 
behaviour. A further significant contribution of the present technique lies in its 
explanation of the scale-dependent effect in which the classical dispersion length 
appears to depend on the 'size' of the system. This matter is clarified without any 
appeal to stochastic differential equations, but simply by the use of an appropriate 
statistical distribution of the fracture orientations within the balance equation. 
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While we have developed the transport equation in a three-dimensional form, it 
has only been solved in essentially one-dimensional geometry. The most 
important extension required, therefore, is to a truly three-dimensional structure. 
This will necessitate development of the scattering term g(...) and the interaction 
probability 2. In many ways the use of the transport equation developed here is 
the deterministic analogue of the Monte Carlo approach, favoured by many 
workers, who track marked particles through computer generated fracture sets. 
It is well-known that such a Monte Carlo approach is very inefficient numerically 
so that we have an additional advantage of the transport equation approach. 
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